Separatome-based protein expression and purification platform

ABSTRACT

Provided is a separatome-based recombinant peptide, polypeptide, and protein expression and purification platform based on the juxtaposition of the binding properties of host cell genomic peptides, polypeptides, and proteins with the characteristics and location of the corresponding genes on the host cell chromosome, such as that of E. coli, yeast, Bacillus subtilis or other prokaryotes, insect cells, mammalian cells, etc. The separatome-based protein expression and purification platform quantitatively describes and identifies priority deletions, modifications, or inhibitions of certain gene products to increase chromatographic separation efficiency, defined as an increase in column capacity, column selectivity, or both, with emphasis on the former. Moreover, the separatome-based protein expression and purification platform provides a computerized knowledge tool that, given separatome data and a target recombinant peptide, polypeptide, or protein, intuitively suggests strategies leading to efficient product purification. The separatome-based protein expression and purification platform is an efficient bioseparation system that intertwines host cell expression systems and chromatography.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a Continuation of U.S. application Ser. No. 14/521,845, filed Oct. 23, 2014, now U.S. Pat. No. 9,816,068, issued on Nov. 14, 2017; which is a Continuation of U.S. application Ser. No. 14/056,747, filed Oct. 17, 2013, now U.S. Pat. No. 8,927,231, issued Jan. 6, 2015; which is a Continuation of PCT application Ser. No. PCT/US2013/030549, filed Mar. 12, 2013; which claims the benefit of priority of U.S. provisional application Ser. No. 61/610,298, filed Mar. 13, 2012. This application claims the benefit of priority of each of the listed prior applications, and the contents of each of these prior applications are herein incorporated by reference in their entirety.

STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH OR DEVELOPMENT AND JOINT RESEARCH AGREEMENT DISQUALIFICATION UNDER THE CREATE ACT (COOPERATIVE RESEARCH AND TECHNOLOGY ENHANCEMENT ACT OF 2004 (CREATE ACT) (PUB. L. 108-453, 118 STAT. 3596 (2004)))

This invention was made with government support under grants Nos. 0534836, 0533949, 1237252, 1142101, and 1048911, awarded by the National Science Foundation. The U.S. government has certain rights in the invention.

The present invention was collaboratively made by scientists from the University of Arkansas and the University of Pittsburgh under the above-noted joint NSF grants that were in effect on or before the date the presently claimed invention was made. The claimed invention was made as a result of activities undertaken within the scope of the joint research agreement. The term “joint research agreement” means the joint NSF research grants awarded to the above-noted parties for the performance of experimental, developmental, or research work in the field of the claimed invention.

BACKGROUND OF THE INVENTION

Field of the Invention

The present invention relates to a proteomics-based protein expression and purification platform, more particularly a single cell line, or set of cell lines, designed by manipulating the separatomes associated with various separation techniques, in particular column chromatography, that can be used in a wide variety of processes for the expression of recombinantly produced peptides, polypeptides, and proteins, and to the subsequent rapid, efficient, and economical recovery thereof in high yield, thereby eliminating the need to develop individualized host cells for each purification process.

DESCRIPTION OF RELATED ART

Current society is heavily dependent on mass-manufactured peptides, polypeptides, and proteins that are used in everything from cancer treatment medications to laundry detergents. More than 325 million people worldwide have been helped by the over 155 recombinantly produced polypeptides and peptides (drugs and vaccines) currently approved by the United States Food and Drug Administration. In addition, there are more than 370 biotechnology drug products and vaccines currently in clinical trials targeting more than 200 diseases, including various cancers, Alzheimer's disease, heart disease, diabetes, multiple sclerosis, immunodeficiency, and arthritis. Enzymes used in industrial processes claim a market of approximately $2.7 billion, with an expected growth to a value of $6 billion by 2016. Of the approximately 3000 industrial enzymes in use today for applications in the biotechnology, food, fuel, and pulp and paper industries, about one-third are produced in recombinant bacteria.

Manufacturing of therapeutically useful peptides, polypeptides, and proteins has been hampered, in large part, by the limitations of the organisms currently used to express these molecules, and by the often extensive recovery steps necessary as the final product is isolated. Recombinant protein expression is the preferred, predominant method for the manufacture of these pharmaceuticals, herein referred to as “biologics” to differentiate them, in particular, both from chemically synthesized therapeutics (e.g., antihistamines or CNS drugs) and from industrial enzymes such as pectinases or restriction endonucleases, for example. In general, the purification of a biologic to within tolerable limits is the most costly stage of manufacturing and validation, with the burden of regulation placed upon it by the Food and Drug Administration (FDA) or similar (inter)national entity. Recombinant DNA techniques, hybridoma technologies, mammalian cell culturing, metabolic engineering, and fermentation improvements have permitted large-scale production of biologics.

As large-scale production issues are solved, manufacturing steps that limit productivity are shifted downstream. In an effort to quicken time-to-clinic and market, research efforts have focused on cutting material costs, improving productivity at large-scale, and developing robust, generic separation steps. In the biologics manufacturing process, cell lines are cultivated to produce, or express, the biologic; during this process, the desired biologic is expressed alongside unwanted host cell proteins. These contaminants then have to be separated from the biologic through expensive and time-consuming multi-step purification processes that often include centrifugation, ultrafiltration, extraction, precipitation, and the cornerstone of bioseparation, chromatographic separation. Since downstream processes account for 50% to 80% of total manufacturing costs, efforts to optimize purification of high-value, high-quality products are critical to success in the biopharmaceutical industry. For example, if there is a modest 5% loss of biologic per purification step, final yields of about 70% are encountered should the processing require 5-8 downstream steps. This overall loss is intolerable as market demands for biologics increase. End-uses for peptides, polypeptides, and proteins produced recombinantly, other than biologics, include, but are not limited to, diagnostic kits (e.g., glucose dehydrogenase for glucose sensing), enabling technologies (e.g., ligases for recombinant DNA efforts), consumer products (e.g., proteases for laundry soap), manufacturing (e.g., isomerases for production of corn syrup), and biofuel generation (e.g., cellulases for switchgrass processing). Materials of these product categories also suffer from the desire for efficient downstream processing, although their product validation is less stringent than for a biologic.
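The compounding of per-step losses noted above can be checked with a short calculation. The following is a minimal sketch that assumes each downstream step loses the same fraction of product independently of the others; the function name `overall_yield` is illustrative and is not part of the disclosure.

```python
def overall_yield(step_loss: float, n_steps: int) -> float:
    """Fraction of product remaining after n_steps purification steps.

    Assumes (for illustration only) that every step loses the same
    fraction `step_loss` of whatever material enters it.
    """
    if not 0.0 <= step_loss <= 1.0:
        raise ValueError("step_loss must be between 0 and 1")
    if n_steps < 0:
        raise ValueError("n_steps must be non-negative")
    return (1.0 - step_loss) ** n_steps


# A 5% loss per step over 5 to 8 downstream steps
for n in range(5, 9):
    print(f"{n} steps: {overall_yield(0.05, n):.1%} of product recovered")
```

Under this assumption, the recovered fraction falls from roughly 77% at five steps to roughly 66% at eight steps, which brackets the figure of about 70% given above.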

For the illustrations above, both recovery from the culture and purification are paramount. Challenges to the industry standard technique of column chromatography, a critical element to most bioseparation schemes, are dictated by lack of separation efficiency, the variety of chromatography separation media, and the diverse composition of the mobile phase. Lack of separation efficiency manifests itself predominantly as a reduction in column capacity, defined as the amount of target molecule bound per adsorption cycle, and selectivity, defined as the amount of target molecule bound divided by the total amount of material bound per adsorption cycle. The traditional method of addressing separation efficiency is empirical, and is driven by past experience because no software design tool, similar to CHEMCAD (chemical engineering) and SPICE (electrical engineering), for bioseparation process design exists in the public domain, if at all. Therefore, any improvements in the recovery of peptides, polypeptides, or proteins in terms of an increase in separation efficiency, column capacity in particular, have been traditionally gained by improvements in the properties of the chromatographic adsorbent, by artful design of the gradient used to elicit separation, or in some cases, by the enhancement of binding through the addition of His₆, maltose binding protein, Arg₈, or similarly designed affinity tails or tags. Although affinity tails or tags are widely used for purification of recombinant proteins, in particular through the use of His₆, the continued presence of genomic peptides, polypeptides, and proteins exhibiting affinity for the resins used in these chromatographic methods remains problematic. Notably, when host cell genomic peptides, polypeptides, and proteins are retained in the adsorption step, significant losses in column capacity and complications in gradient elution occur. Selection of companion chromatographic steps in a rational manner to increase separation efficiency, i.e., separation capacity (product recovery), separation selectivity (product purity), or both, is nearly impossible due to lack of knowledge regarding the contaminant species, and is therefore developed somewhat arbitrarily, requiring tedious, time-consuming, and expensive trial and error experimentation.
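The two definitions given above, column capacity and selectivity, can be written out directly. The following is a minimal sketch; the class `AdsorptionCycle`, its field names, and the milligram units are assumptions introduced for illustration only.

```python
from dataclasses import dataclass


@dataclass
class AdsorptionCycle:
    """Material bound to the column in one adsorption cycle (hypothetical units: mg)."""
    target_bound_mg: float  # target molecule bound
    host_bound_mg: float    # host cell peptides, polypeptides, and proteins bound


def column_capacity(cycle: AdsorptionCycle) -> float:
    """Capacity: amount of target molecule bound per adsorption cycle."""
    return cycle.target_bound_mg


def selectivity(cycle: AdsorptionCycle) -> float:
    """Selectivity: target bound divided by total material bound per cycle."""
    total = cycle.target_bound_mg + cycle.host_bound_mg
    if total <= 0:
        raise ValueError("no material bound in this cycle")
    return cycle.target_bound_mg / total


# Illustrative numbers only, not measured data
cycle = AdsorptionCycle(target_bound_mg=40.0, host_bound_mg=10.0)
print(column_capacity(cycle), selectivity(cycle))
```

In this framing, host cell species that occupy binding sites lower both quantities: they displace target molecule (capacity) and enlarge the denominator of the ratio (selectivity).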

As disclosed herein, one route to supplement traditional means to aid in the purification of peptides, polypeptides, or proteins would be to alter the proteome of the host cell in order to reduce the burden of host cell contaminant adsorption. This concept is orthogonal to the series of patents and applications by Blattner et al. that disclose a number of different strains of E. coli engineered to contain reduced genomes (in contrast to the proteome) to facilitate the production of recombinant proteins (U.S. Pat. Nos. 8,178,339; 8,119,365; 8,043,842; 8,039,243; 7,303,906; 6,989,265; U.S. 20120219994A1; and EP1483367B1). U.S. Pat. No. 8,119,365 claims E. coli wherein the genome is between 4.41 Mb and 2.78 Mb. U.S. Pat. No. 8,043,842 claims E. coli wherein the genome is between 4.27 Mb and 4.00 Mb. U.S. Pat. No. 8,039,243 claims variously between 4.41 Mb and 3.71 Mb, 4.31 Mb and 3.71 Mb, and 4.27 Mb and 3.71 Mb. U.S. Pat. No. 6,989,265 discloses E. coli wherein the genome is at least 5% to at least 14% smaller than the genome of its native parent strain. EP1483367B1 claims E. coli having a chromosome that is genetically engineered to be 5% to 40% smaller than the chromosome of its native parent E. coli strain.

These documents variously discuss the concepts of reduced genome E. coli for use in the production of recombinant proteins, improving recombinant protein expression in E. coli by improving the growth/yield properties and robustness as a recombinant host by eliminating large numbers of non-essential genes and improving E. coli transformation competence. Expression of endogenous/native proteins in host cells is also presumed to be reduced. None of these documents either discloses or discusses chromatographic purification procedures, or the optimization thereof in conjunction with the design of optimized host cells, to improve separation efficiency leading to a purified or partially purified target peptide, polypeptide, or protein.

U.S. 2009/0075352 discloses the use of in silico comparative metabolic and genetic engineering analyses to improve the production of useful substances in host strains. The genomic information of a target strain for producing a useful substance is compared to the genomic information of a strain that overproduces the useful substance, and genes unnecessary for the overproduction of the useful substance are screened for and deleted, thereby improving product yield. As in the case of the patent documents discussed above, this application does not disclose or discuss chromatographic purification procedures, or the optimization thereof to improve separation efficiency leading to a target peptide, polypeptide, or protein.

Yu et al. (2002) Nature Biotechnol. 20:1018-1023 discloses a method for determining essential genes in E. coli and minimizing the bacterial genome by deleting large genomic fragments, thereby deleting genes that are nonessential under a given set of growth conditions and identifying a minimized set of essential E. coli genes and DNA sequences. Neither the term “chromatography” nor “purification” is mentioned.

U.S. application 2012/0183995 discloses genetic modification of Bacillus species to improve the capacity to produce expressed proteins of interest, wherein one or more chromosomal genes are inactivated or deleted, or wherein one or more indigenous chromosomal regions are deleted from a corresponding wild-type Bacillus host chromosome. This includes removing large regions of chromosomal DNA in a Bacillus host strain wherein the deleted indigenous chromosomal region is not necessary for strain viability. These modifications enhance the ability of an altered Bacillus strain to express a higher level of a protein of interest over a corresponding non-altered Bacillus host strain. This application does not discuss improved chromatographic separation of expressed target recombinant peptides, polypeptides, or proteins from endogenous Bacillus proteins.

Asenjo et al. (2004), “Is there a rational method to purify proteins? From expert systems to proteomics”, Journal of Molecular Recognition 17:236-247, discusses optimizing protein purification steps based on knowledge of the physicochemical properties of the target protein product and the protein contaminants. The paper notes “the rule of thumb that reflects the logic of first separating impurities present in higher concentrations.” The concept of reduced genome host cells is not disclosed.

While the previously mentioned patents and journal articles do not disclose or discuss chromatographic purification procedures, other references either outline the general process by which data on host cell proteins that interact with chromatography media can be obtained, or focus on the elimination of product-specific impurities through gene knockout. Cai et al. (2004) Biotechnol. Bioeng. 88:77 and Tiwari et al. (2010) Protein Expression and Purification 70:191-195 disclose the application of cellular extracts of E. coli to various affinity and non-affinity chromatographic media, and the identification of adsorbed proteins by mass spectroscopy and 2D gel electrophoresis. While the metabolic characteristics of the proteins encountered were discussed, these references do not disclose any indications of improvement in separation efficiency. Liu et al. (2009) J. Chromatog. A 1216:2433-2438, Bartlow et al. (2011) Protein Expression and Purification 78:216-224, and Bartlow et al. (2012) American Institute of Chemical Engineers Biotechnol. Prog. 28:137-145 disclose the potential for improvement in product quality, purity in particular, should genes that express proteins that co-elute with histidine-extended Green Fluorescent Protein be deleted from the chromosome of E. coli. The quantitative data in this series of papers do not disclose or suggest improvements that lead to an increase in column capacity, nor do they demonstrate improvements that point to a universally applicable host strain with improved properties, useful for producing a variety of different peptides, polypeptides, or proteins, be they extended with an affinity tail or tag (or not). Indeed, should the genes identified and deemed important in Liu et al. (2009), supra, be deleted, an increase of significantly less than one percent (1%) in column capacity would be achieved. A similar argument for the deletion of genes responsible for product-specific contaminants applies to Caparon et al. (2010) Biotechnol. Bioeng. 105(2):239-249. This article discloses four specific gene deletions that improve the purity of the final biologic, since three of the proteins co-elute with the target and a fourth causes proteolytic degradation of the biologic. Lacking in this reference is a means of applying quantitative metrics to prioritize efforts that lead to increases in separation efficiency independent of target peptides, polypeptides, and proteins, and a method to interpret these data to prepare a host cell or set of host cells that provide increases in separation efficiency for as many different target molecules as possible.

In view of the foregoing, there exists a need for improved methods for recovering in quantity, and purifying, recombinant target peptides, polypeptides, and proteins from E. coli and other host cells routinely used for recombinant expression of, for example, therapeutic proteinaceous molecules and industrial enzymes. Development of bioseparation regimens can be challenging, requiring somewhat arbitrary trial and error combination of conventional chromatographic methods. The presence of host cell peptides, polypeptides, and proteins reduces separation step efficiency (adsorption and elution), and the tradeoff between overall yield and purity may not be optimal. Alternately, although the use of an affinity tail helps reduce the chromatographic space explored, it can still be plagued by co-adsorbing/co-eluting molecules, requiring further purification steps; addition/removal of the affinity tail via digestion steps; and cost (ligand and endonuclease).

The methods and host cells of the present invention address these problems and meet these needs. The present invention provides a novel route to supplement or supplant conventional methods to aid in the purification of target recombinant peptides, polypeptides, and proteins. This is accomplished by providing a rational scheme for altering the proteome of host cells used for expression in order to reduce the burden of adsorption of host cell peptides, polypeptides, and proteins that interfere with target molecule recovery and purification. The first step is to identify the separatome, defined as a sub-proteome associated with a separation technique, column chromatography for example, through a formal method that mathematically prioritizes specific modifications to the proteome via, for example, gene knockout, gene silencing, gene modification, or gene inhibition. Host cells, or sets of host cells, of the present invention display a reduced separatome, the properties of which lead to an increase in column capacity as peptides, polypeptides, or proteins with high affinity are eliminated first. Uniquely focusing on host cell peptides, polypeptides, or proteins with high affinity, rather than those with affinity similar to, or less than, that of a presumed target recombinant molecule, facilitates a set of modifications that are useful for improving separation efficiency for a range of peptides, polypeptides, or proteins. Such high affinity host cell peptides, etc., are problematic regardless of the nature of the target recombinant molecule because not only can they display an elution profile that may decrease purity, but they also remain bound to the column due to the stringent conditions necessary for their desorption.
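By way of illustration only, the idea of addressing high-affinity host cell species first can be sketched as a simple ranking. The formal prioritization method is not reproduced in this passage, so the scoring below is an assumption: each host protein is described by a hypothetical `elution_strength` (how stringent the conditions for desorption are, in arbitrary units) and a hypothetical `bound_amount` (material retained per cycle, in arbitrary units), and only species binding more strongly than a presumed target are retained and ordered.

```python
from typing import List, NamedTuple


class HostProtein(NamedTuple):
    gene: str                # placeholder gene label, not a real locus
    elution_strength: float  # stringency needed to desorb (assumed, arbitrary units)
    bound_amount: float      # amount retained on the column (assumed, arbitrary units)


def prioritize(proteins: List[HostProtein],
               target_elution_strength: float) -> List[HostProtein]:
    """Rank host proteins that bind more strongly than the presumed target.

    Illustrative ordering only: strongest binders first, ties broken by
    the amount of column capacity they occupy.
    """
    high_affinity = [p for p in proteins
                     if p.elution_strength > target_elution_strength]
    return sorted(high_affinity,
                  key=lambda p: (p.elution_strength, p.bound_amount),
                  reverse=True)


# Placeholder data for demonstration, not experimental results
candidates = [
    HostProtein("geneA", 0.2, 5.0),
    HostProtein("geneB", 0.9, 1.0),
    HostProtein("geneC", 0.9, 3.0),
    HostProtein("geneD", 0.6, 8.0),
]
for p in prioritize(candidates, target_elution_strength=0.5):
    print(p.gene)
```

With these placeholder values, `geneA` is dropped because it binds more weakly than the presumed target, and the remaining genes are listed from strongest to weakest binder.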

The separatome-based protein expression and purification platform disclosed herein provides benefits including, but not limited to, reduction of the chromatography regimen, of column capacity loss due to adsorption of contaminating host cell peptides, polypeptides, and proteins, and of the complexity of elution protocols, since fewer interfering peptides, polypeptides, and proteins need to be resolved.

The present separatome-based protein expression and purification platform facilitates the modification of unoptimized host cell lines in order to eliminate the expression of undesirable, interfering peptides, polypeptides, and proteins during host cell cultivation, thereby reducing the total amount and cost of purification needed to produce a higher concentration, and absolute amount, of purified target recombinant product.

The separatome-based invention disclosed herein further provides a proteomics-based protein expression and purification platform based on a computer database and modeling system of separatome data for individually customized cell lines that facilitate recovery and purification of difficult to express, low yield proteins.

The separatome-based expression and purification platform disclosed herein also provides for modified host cell lines having a genome encoding and/or expressing a reduced number of nuisance or contaminating proteins, thereby decreasing the complexity and costs of the purification process.

Furthermore, the present invention provides a separatome-based expression and purification platform that utilizes an engineered series of broadly applicable bacterial and other host cells to provide facile purification systems for target recombinant peptide, polypeptide, and protein separation.

Compared to previous approaches involving the deletion of large numbers of host cell genes, the separatome-based method for designing host cells for expression of target peptides, polypeptides, and proteins provided herein is more “surgical”, i.e., targeted and precise, and does not result in the deletion of large regions of host cell genomes. The present invention provides a rational framework for optimizing target recombinant peptide, polypeptide, or protein recovery and purification based on identification of host cell peptide, polypeptide, and protein contaminants that reduce the separation efficiency, i.e., separation capacity (product recovery), separation selectivity (product purity), or both, of target recombinant peptides, polypeptides, and proteins based on knowledge of the binding characteristics of contaminating species during chromatographic purification. This permits the coordinated design of universally useful, optimized host cells for target recombinant peptide, polypeptide, or protein expression and concomitant purification procedures using the smallest number of operations, and eliminates the need for arbitrary, tedious, time-consuming, and expensive trial and error experimentation. The methods disclosed herein avoid the need to design individualized host cell expression and chromatographic systems for specific recombinant target proteinaceous products, and provide a rational “separatomic” procedure and materials to eliminate and separate the main interfering peptide, polypeptide, and protein components of host cells using the minimum number of process steps. The present methods and host cells minimize, or in most cases, completely avoid the problems of eliminating host cell genes and proteins required for growth, viability, and target molecule expression that would adversely affect the use of such cells for expression of target recombinant peptides, polypeptides, and proteins. In some cases, the present engineered host cells exhibit improved growth, viability, and expression compared to the parental cells from which they are derived. This can be attributed, at least in part, to avoiding the problem of eliminating genes that are dispensable individually, but not in combination.

SUMMARY OF THE INVENTION

The present invention provides a separatome-based protein expression and purification platform comprising a system of separatome data for a host cell, which comprises data compiled on the genome and proteome sequences of the host cell, and a data visualization tool for graphically displaying such separatome data for identification and/or modification of contiguous or individual regions of nuisance or coeluting proteins of host cells. The separatome data can comprise data compiled on the metalloproteome and metabolome of the host cell. Host cells included in this platform include, for example, Escherichia coli, yeasts, Bacillus subtilis and other prokaryotes, and any of the other host cells conventionally used for expression of peptides, polypeptides, and proteins disclosed herein.

The system of separatome data is based on identified, conserved genomic regions of host cells that span resin- and gradient-specific chromatographies based on a relationship of binding properties of the peptides, polypeptides, and proteins encoded by the identified, conserved genomic regions for these chromatographies with the characteristics and location of genes on the chromosome(s) of host cells. The chromatographies include Immobilized-Metal Affinity Chromatography (IMAC), cation exchange chromatography (cation IEX), anion exchange chromatography (anion IEX), Hydrophobic Interaction Chromatography (HIC), or combinations thereof.

The present invention also encompasses a separatome-based protein expression and purification process for manufacturing of a modified cell line having a genome encoding a reduced number of contaminating peptides, polypeptides, and proteins, wherein the process comprises the steps of:

(1) graphically displaying a separatome of a target host cell line as a visualization tool in conjunction with relevant biochemical information;

(2) identifying specific genes coding for contaminating peptides, polypeptides, and proteins for the target host cell line, and/or identifying specific genes encoding particular nuisance peptides, polypeptides, and proteins of the target host cell line;

(3) identifying, when possible, large contiguous genomic regions coding for contaminating peptides, polypeptides, and proteins for the target host cell line, and/or identifying specific genes encoding particular nuisance peptides, polypeptides, and proteins of the target host cell line;

(4) deleting the large contiguous genomic regions coding for contaminating peptides, polypeptides, and proteins, and/or the specific genes encoding particular nuisance peptides, polypeptides, and proteins, of the target host cell line from the genome of the target host cell by large scale or targeted knockout, respectively; and

(5) deleting regions encoding any contaminant peptides, polypeptides, or proteins remaining in the genome of the target host cell after step (3) by gene specific knockout and/or PCR point mutation to form the modified cell line.
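The distinction drawn in the steps above between large contiguous regions (candidates for large scale knockout) and isolated genes (candidates for targeted knockout) can be sketched as a pass over the genes in chromosomal order. This is a minimal sketch under stated assumptions: the `Gene` record, the boolean flags, the `min_run` threshold, and the conservative choice to let any non-contaminant gene end a region are all illustrative and are not taken from the disclosure.

```python
from typing import List, NamedTuple, Tuple


class Gene(NamedTuple):
    name: str          # placeholder label, not a real locus
    contaminant: bool  # product interferes with chromatographic separation
    essential: bool    # product required by the host cell


def plan_deletions(genes_in_order: List[Gene],
                   min_run: int = 2
                   ) -> Tuple[List[List[str]], List[str], List[str]]:
    """Split contaminant genes into contiguous regions, singles, and essentials.

    Returns (regions, singles, essential_contaminants):
      regions  - runs of >= min_run adjacent nonessential contaminant genes
      singles  - isolated nonessential contaminant genes
      essential_contaminants - contaminant genes flagged for handling other
                               than deletion (e.g. modification)
    """
    regions: List[List[str]] = []
    singles: List[str] = []
    essential_contaminants: List[str] = []
    run: List[Gene] = []

    def flush() -> None:
        # Close the current run and file it as a region or as single genes
        if len(run) >= min_run:
            regions.append([g.name for g in run])
        else:
            singles.extend(g.name for g in run)
        run.clear()

    for gene in genes_in_order:
        if gene.contaminant and not gene.essential:
            run.append(gene)
        else:
            flush()  # any other gene interrupts a contiguous region
            if gene.contaminant:
                essential_contaminants.append(gene.name)
    flush()
    return regions, singles, essential_contaminants


# Placeholder chromosome order for demonstration only
chromosome = [
    Gene("g1", True, False), Gene("g2", True, False), Gene("g3", False, False),
    Gene("g4", True, False), Gene("g5", True, True), Gene("g6", True, False),
    Gene("g7", True, False), Gene("g8", True, False),
]
print(plan_deletions(chromosome))
```

A less conservative variant could allow nonessential, non-contaminant genes to be absorbed into a region; that choice is left open here because the passage does not specify it.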

The target host cell can be selected from Escherichia coli, yeasts, Bacillus subtilis or other prokaryotes, or any of the other host cells conventionally used for expression disclosed herein.

In this process, the separatome is a system of chromatographic data of the juxtaposition of binding properties of peptides, polypeptides, and proteins encoded by identified, conserved genomic regions for chromatography methods with the characteristics and location of genes on the chromosome of the target host cell. The chromatographic methods of this process comprise Immobilized-Metal Affinity Chromatography (IMAC), cation exchange chromatography (cation IEX), anion exchange chromatography (anion IEX), Hydrophobic Interaction Chromatography (HIC), or combinations thereof.

In this process, step (1) further comprises identifying the contaminating proteins as essential and nonessential peptides, polypeptides, and proteins of the target host cell. Coding regions (genes) for essential peptides, polypeptides, and proteins can be reintroduced into the genome of the target host cell. The process can further comprise the step of constructing a larger fragment homologous to the target host cell. The fragment can be linear and sequenced with essential genes, and further comprises marker selection and selection removal.

The present invention also provides optimized strains of Escherichia coli modified by a separatome-based peptide, polypeptide, and protein expression and purification process, wherein the strain comprises a genome having (encoding) a reduced number of nuisance or coeluting peptides, polypeptides, and proteins. The separatome-based peptide, polypeptide, and protein expression and purification process can be a two-step purification process based on chromatotomes of combinations of chromatographies of Escherichia coli, and the nuisance or coeluting proteins can be reduced via large scale knockout, gene specific knockout, PCR point mutation, or a combination thereof.

More particularly, the present invention encompasses the following:

1. A host cell for expression of a target recombinant peptide, polypeptide, or protein, comprising:

i) a reduced genome compared to the genome in the parent cell from which it is derived, or

ii) a modified genome compared to the genome in the parent cell from which it is derived, or

iii) in which expression of genes is reduced or completely inhibited compared to expression of said genes in the parent cell from which it is derived,

wherein genes that are deleted, modified, or the expression of which is reduced or completely inhibited in said host cell, code for peptides, polypeptides, or proteins that impair the chromatographic separation efficiency of said target recombinant peptide, polypeptide, or protein expressed in said host cell.

2. The host cell of 1, wherein said chromatographic separation efficiency of said target recombinant peptide, polypeptide, or protein is improved compared to the chromatographic separation efficiency of said target recombinant peptide, polypeptide, or protein in the presence of peptides, polypeptides, or proteins coded for by said genes that are deleted, modified, or the expression of which is reduced or completely inhibited in said host cell upon affinity or adsorption, non-affinity column chromatography of said target recombinant peptide, polypeptide, or protein.

3. The host cell of 2, wherein improvement of said chromatographic separation efficiency of said target recombinant peptide, polypeptide, or protein is in the range of from about 5% to about 35%, or from about 10% to about 20%, compared to chromatographic separation efficiency of said target recombinant peptide, polypeptide, or protein in the presence of peptides, polypeptides, or proteins coded for by said genes that are deleted, modified, or the expression of which is reduced or completely inhibited in said host cell upon affinity or adsorption, non-affinity column chromatography of said target recombinant peptide, polypeptide, or protein.

4. The host cell of any one of 1-3, wherein said chromatographic separation efficiency is independent of elution conditions under which said target recombinant peptide, polypeptide, or protein emerges from an affinity or adsorption, non-affinity chromatography column as an enriched fraction.

5. The host cell of any one of 1-4, wherein deletion of said gene is performed by homologous recombination.

6. The host cell of any one of 1-4, wherein modification of said genes is performed by a method selected from the group consisting of point mutation, isozyme substitution, and transposon mutagenesis.

7. The host cell of any one of 1-4, wherein expression of said genes is reduced or completely inhibited by a method selected from the group consisting of RNA silencing, antisense oligonucleotide inhibition, and replacement of a native promoter with a weaker promoter.

8. The host cell of any one of 1-7, which exhibits about 75% to about 100% of the viability, growth rate, or capacity for expression of said target recombinant peptide, polypeptide, or protein expressed in said host cell compared to that of said parent cell from which it is derived, or which exhibits viability, growth rate, or capacity for expression of said target recombinant peptide, polypeptide, or protein expressed in said host cell greater than that of said parent cell from which it is derived.

9. The host cell of any one of 1-8, wherein said target recombinant peptide, polypeptide, or protein is present in a lysate of said host cell, or is secreted by said host cell.

10. The host cell of any one of 1-9, wherein said target recombinant peptide, polypeptide, or protein is an endogenous peptide, polypeptide, or protein.

11. The host cell of 10, wherein said endogenous peptide, polypeptide, or protein is selected from the group consisting of a nuclease, a ligase, a polymerase, an RNA- or DNA-modifying enzyme, a carbohydrate-modifying enzyme, an isomerase, a proteolytic enzyme, and a lipolytic enzyme.

12. The host cell of any one of 1-9, wherein said target recombinant peptide, polypeptide, or protein is a heterologous peptide, polypeptide, or protein.

13. The host cell of 12, wherein said heterologous peptide, polypeptide, or protein is selected from the group consisting of an enzyme and a therapeutic peptide, polypeptide, or protein.

14. The host cell of 13, wherein said enzyme is selected from the group consisting of a nuclease, a ligase, a polymerase, an RNA- or DNA-modifying enzyme, a carbohydrate-modifying enzyme, an isomerase, a proteolytic enzyme, and a lipolytic enzyme, and said therapeutic peptide, polypeptide, or protein is selected from the group consisting of antibody, an antibody fragment, a vaccine, an enzyme, a growth factor, a blood clotting factor, a hormone, a nerve factor, an interferon, an interleukin, tissue plasminogen activator, and insulin.

15. The host cell of any one of 1-14, which is selected from the group consisting of a bacterium, a fungus, a mammalian cell, an insect cell, a plant cell, and a protozoal cell.

16. The host cell of 15, wherein said bacterium is E. coli, B. subtilis, P. fluorescens, or C. glutamicum; said fungus is a yeast selected from the group consisting of S. cerevisiae and K. pastoris; said mammalian cell is a CHO cell or a HEK cell; said insect cell is an S. frugiperda cell; said plant cell is a tobacco, alfalfa, rice, tomato, or soybean cell; and said protozoal cell is a L. tarentolae cell.

17. The host cell of 16, wherein said bacterium is E. coli.

18. The E. coli host cell of 17, wherein said parent cell from which said E. coli host cell is derived is selected from the group consisting of E. coli K-12, E. coli MG, E. coli BL, and E. coli DH.

19. The host cell of 16, wherein said bacterium is B. subtilis.

20. The B. subtilis host cell of 19, wherein said parent cell from which said B. subtilis host cell is derived is selected from the group consisting of B. subtilis 168 and B. subtilis BSn5.

21. The host cell of 16, wherein said S. cerevisiae and K. pastoris are selected from the group consisting of S. cerevisiae S288c and AWRI796, and K. pastoris CBS7435 and GS115, respectively.

22. The host cell of 16, wherein said CHO cell is CHO-K1 and said HEK cell is HEK 293.

23. The E. coli parent cell of 18, which is selected from the group consisting of E. coli K-12, E. coli MG1655, E.
coli BL21 (DE3), and E.coli DH10B.24. E. coli strain K-12, MG1655, BL21 (DE3), and DH10B of 23, having agenome comprising the nucleotide sequence disclosed in the reference ofTable Entry Number 1, 2, 3, and 4, respectively, in Table 1.25. B. subtilis strain 168 and BSn5 of 20, having a genome comprisingthe nucleotide sequence disclosed in the reference of Table Entry Number1 and 2, respectively, in Table 2.26. S. cerevisiae strain S288c and AWRI796 of 21, having a genomecomprising the nucleotide sequence disclosed in the reference of TableEntry Number 1 and 2, respectively, in Table 3.27. K. pastoris strain CBS7435 and GS115 of 21, having a genomecomprising the nucleotide sequence disclosed in the reference of TableEntry Number 1 and 2, respectively, in Table 4.28. CHO cell strain CHO-K1 of 22, having a genome comprising thenucleotide sequence disclosed in the reference of Table Entry Number 1in Table 5.29. HEK cell strain HEK 293 of 22, having a genome comprising thenucleotide sequence disclosed in the reference of Table Entry Number 1in Table 6.30. The E. coli host cell of any one of 16-18 or 23-24, wherein saidreduced genome compared to the genome in the parent cell from which itis derived is less than 5% smaller, less than about 4.5% smaller, lessthan about 4% smaller, less than about 3.5% smaller, less than about 3%smaller, less than about 2.5% smaller, less than about 2% smaller, lessthan about 1.5% smaller, or less than about 1% smaller, than the genomeof said parent cell from which it is derived.31. The E. coli host cell of any one of 16-18 or 23-24, wherein saidreduced genome compared to the genome in the parent cell from which itis derived is between about 4.17 Mb to about 4.346 Mb.32. An E. coli host cell for expression of a target recombinant peptide,polypeptide, or protein, comprising:

i) a reduced genome compared to the genome in the parent cell from which it is derived, or

ii) a modified genome compared to the genome in the parent cell from which it is derived, or

iii) in which expression of genes is reduced or completely inhibited compared to expression of said genes in the parent cell from which it is derived,

wherein said parent cell is E. coli strain K-12, MG1655, BL21 (DE3), or DH10B, having a genome comprising the nucleotide sequence disclosed in the reference of Table Entry Number 1, 2, 3, and 4, respectively, in Table 1, and

wherein genes that are deleted, modified, or the expression of which is reduced or completely inhibited in said host cell compared to expression of said genes in said parent cell from which it is derived, code for proteins that impair the chromatographic separation efficiency of said target recombinant peptide, polypeptide, or protein expressed in said host cell in the presence of peptides, polypeptides, or proteins coded for by said genes that are deleted, modified, or the expression of which is reduced or completely inhibited in said host cell, and that elute from a chromatographic affinity column having a ligand, in a buffer comprising a compound that dictates adsorption to its respective ligand during equilibration and elution from said affinity column, in an amount in the range, in a combination selected from the group consisting of the combinations in the following table A:

TABLE A

Ligand | Compound in Buffer That Dictates Adsorption to Affinity Column During Equilibration and Causes Elution From Column | Concentration or pH Range
Glutathione S-transferase | Glutathione | from about 0 mM to about 10 mM
Amino acid (e.g., lysine) | A common salt | from about 0 mM to about 2 M
Amino acid | pH | from about pH 2 to about pH 11
Avidin | A chaotropic salt | from about 0 M to about 4 M
Avidin | pH | from about pH 2 to about pH 10.5
Carbohydrate (e.g., Dextrin) | Sugar or isocratic (e.g., maltose) | from about 0 mM to about 10 mM
Carbohydrate | pH | from about pH 5 to about pH 8
Organic dye (e.g., Cibacron Blue) | A common salt | from about 0 mM to about 1.5 M
Organic dye | pH | from about pH 4 to about pH 8
Organic dye | Imidazole or a common salt | from about 5 mM to about 250 mM
Divalent metal (e.g., Ni(II)) | pH | from about pH 4 to about pH 12
Divalent metal (e.g., Ni(II)) | Imidazole | from about 5 mM to about 500 mM
Heparin | A common salt | from about 0 mM to about 2 M
Protein A or Protein G | Glycine | from about 0 mM to about 100 mM
Protein A or Protein G | pH | from about pH 3 to about pH 7
IgG | Glycine | from about 0 mM to about 100 mM
Coenzyme | Competing Protein | from about 1 mM to about 12 mM

33. An E. coli host cell for expression of a target recombinant peptide, polypeptide, or protein, comprising:

i) a reduced genome compared to the genome in the parent cell from which it is derived, or

ii) a modified genome compared to the genome in the parent cell from which it is derived, or

iii) in which expression of genes is reduced or completely inhibited compared to expression of said genes in the parent cell from which it is derived,

wherein said parent cell is E. coli strain K-12, MG1655, BL21 (DE3), or DH10B, having a genome comprising the nucleotide sequence disclosed in the reference of Table Entry Number 1, 2, 3, and 4, respectively, in Table 1,

wherein genes that are deleted, modified, or the expression of which is reduced or completely inhibited in said host cell, code for host cell peptides, polypeptides, or proteins that impair the chromatographic separation efficiency of said target recombinant peptide, polypeptide, or protein expressed in said host cell, and

wherein genes that are deleted, modified, or the expression of which is reduced or completely inhibited in said host cell compared to expression of said genes in said parent cell from which it is derived, code for proteins that impair the chromatographic separation efficiency of said target recombinant peptide, polypeptide, or protein expressed in said host cell in the presence of peptides, polypeptides, or proteins coded for by said genes that are deleted, modified, or the expression of which is reduced or completely inhibited in said host cell, and that elute from a chromatographic adsorption, non-affinity column having a ligand, in a buffer comprising a compound that dictates adsorption to its respective ligand during equilibration and elution from said adsorption, non-affinity column, in an amount in the range, in a combination selected from the group consisting of the combinations in the following table B:

TABLE B

Ligand | Compound in Buffer That Dictates Adsorption to Non-Affinity Column During Equilibration and Causes Elution From Column | Concentration or pH Range
Ion exchange | Common salt | from about 0 M to about 2 M
Ion exchange | pH | from about pH 2 to about pH 12
Reverse phase | Organic solvent, e.g., acetonitrile | from about 0% to about 100%
Hydrophobic interaction | Common salt | from about 2 M to about 0 M

34. The E. coli host cell of 32 or 33, wherein said common salt is selected from the group consisting of a chloride salt, a sulfate salt, an acetate salt, a carbonate salt, and a propionate salt.

35. The E. coli host cell of 33, wherein said organic solvent is selected from the group consisting of acetonitrile, methanol, and 2-propanol.

36. The E. coli host cell of 33, wherein genes that are deleted, modified, or the expression of which is inhibited, in the genome of said E. coli host cell are selected from the group consisting of the genes listed in Table C:

TABLE C

Gene Name
rpoC rpoB hldD metH entF mukB tgt rnr glgP recC
ycaO glnA ptsI metE sucA hrpA groL gatZ speA thiI
nusA tufA degP clpB rapA metL ycfD nagD ilvA fusA
cyaA gldA dnaK ygiC gyrA glnE carB ppsA degQ usg
ilvB thrS recB entB dusA typA prs cysN atpD purL

and combinations thereof.

37. The E. coli host cell of 33, wherein said parent cell E. coli strain is MG1655 (genotype: Wild Type: F-, λ, rph-1), and the following combinations of genes are deleted, modified, or the expression of which is inhibited: LTS00 (genotype: ΔthyA); LTS01+ (genotype: ΔmetH); LTS01 (genotype: ΔthyAΔmetH); LTS02+ (genotype: ΔmetHΔentF); LTS02 (genotype: ΔthyAΔmetHΔentF); LTS03+ (genotype: ΔmetHΔentFΔtgt); LTS03 (genotype: ΔthyAΔmetHΔentFΔtgt); LTS04+ (genotype: ΔmetHΔentFΔtgtΔrnr); LTS04 (genotype: ΔthyAΔmetHΔentFΔtgtΔrnr); or LTS05+ (genotype: ΔmetHΔentFΔtgtΔrnrΔycaO).

38. The host cell of any one of 1-37, wherein increased separation efficiency is manifested as increased separation capacity, increased separation selectivity, or both.

39. The host cell of 38, wherein separation capacity is defined as the amount of target recombinant peptide, polypeptide, or protein adsorbed to said column per mass lysate in the case where said target recombinant peptide, polypeptide, or protein is not secreted, or mass culture medium in the case where said target recombinant peptide, polypeptide, or protein is secreted, applied to said column, and separation selectivity is defined as the amount of target recombinant peptide, polypeptide, or protein adsorbed to said column per total peptide, polypeptide, or protein adsorbed to said column.

40. The host cell of 38 or 39, wherein said increased separation capacity is in the range of from about 5% to about 35%.

41. The host cell of any one of 1-40, wherein separation of said target recombinant peptide, polypeptide, or protein from host cell peptides, polypeptides, or proteins is performed by column chromatography employing a solid phase chromatography medium.

42. The host cell of 41, wherein said column chromatography is selected from the group consisting of affinity chromatography employing an affinity ligand bound to said solid phase, and adsorption-based, non-affinity chromatography.

43. The host cell of 42, wherein said affinity ligand is selected from the group consisting of an amino acid, a divalent metal ion, a carbohydrate, an organic dye, a coenzyme; glutathione S-transferase, avidin, heparin, protein A, and protein G.

44. The host cell of 43, wherein said divalent metal ion is selected from the group consisting of Cu++, Ni++, Co++, and Zn++; said carbohydrate is selected from the group consisting of maltose, arabinose, and glucose; said organic dye is a dye comprising a triazene moiety; and said coenzyme is selected from the group consisting of NADH and ATP.

45. The host cell of 42, wherein said adsorption-based, non-affinity chromatography is selected from the group consisting of ion exchange chromatography, reverse phase chromatography, and hydrophobic interaction chromatography.

46. The host cell of 45, wherein said adsorption-based, non-affinity chromatography is ion exchange chromatography.

47. The host cell of 46, wherein said ion exchange chromatography employs a ligand selected from the group consisting of diethylaminoethyl cellulose (DEAE), monoQ, and S.

48. The host cell of any one of 41 to 47, wherein said host cell peptides, polypeptides, or proteins that impair separation efficiency of said target recombinant peptide, polypeptide, or protein expressed in said host cell are peptides, polypeptides, or proteins that are strongly retained during column chromatography.

49. The host cell of 48, wherein said host cell peptides, polypeptides, or proteins that are strongly retained during ion exchange chromatography are those that are retained during elution with a mobile phase comprising a common salt in the range of from about 5 mM to about 2,000 mM.

50. The host cell of 49, wherein said host cell peptides, polypeptides, or proteins that are strongly retained during ion exchange chromatography are those that are retained during elution with a mobile phase comprising a common salt in the range of from about 500 mM to about 1,000 mM.

51. The host cell of any one of 41 to 50, wherein said host cell peptides, polypeptides, or proteins that impair the separation efficiency of said target recombinant peptide, polypeptide, or protein expressed in said host cell are peptides, polypeptides, or proteins that are weakly retained during column chromatography.

52. The host cell of 50, wherein said host cell peptides, polypeptides, or proteins that are weakly retained during chromatography are those that are retained during elution with a mobile phase comprising a common salt in the range of from about 5 mM to about 500 mM.

53. The host cell of 52, wherein said host cell peptides, polypeptides, or proteins that are weakly retained during chromatography are those that are retained during elution with a mobile phase comprising a common salt in the range of from about 10 mM to about 350 mM.

54. The host cell of any one of 41 to 53, wherein said host cell peptides, polypeptides, or proteins that impair the separation efficiency of said target recombinant peptide, polypeptide, or protein expressed in said host cell are peptides, polypeptides, or proteins that are both strongly retained and weakly retained during column chromatography.

55. A separatome of chromatographically relevant host cell peptides, polypeptides, and proteins for column affinity chromatography employing an affinity ligand bound to a solid phase or column adsorption-based, non-affinity chromatography, comprising host cell peptides, polypeptides, and proteins based on their capacity recovery potential from said column,

wherein said capacity recovery potential of said host cell peptides, polypeptides, and proteins is quantitatively determined by:

(a) scoring a peptide, polypeptide, or protein (i) with the formulae:

$\text{importance}_{i} = \sum_{j}\left[ b_{1}\left(\frac{y_{c_{j}}}{y_{\max}}\right)\left(\frac{h_{i,j}}{h_{i,\text{total}}}\right)\left(\frac{h_{i,j}}{h_{j,\text{total}}}\right)\left(\frac{MW_{i}}{MW_{\text{ref}}}\right)^{\alpha} \right]_{i}$

with values for a series of peptides, polypeptides, and proteins written in descending order (largest value close to unity downwards to the smallest value), followed by

(b) calculating the capacity recovery potential of a relevant peptide, polypeptide, or protein (i) given by:

$\text{recovery potential}_{i} = \frac{h_{i,\text{total}}}{h_{\text{total},ms}}$

wherein the following definitions apply: $y_{c_{j}}$ and $y_{\max}$ = concentration of mobile phase eluent in fraction (j) and maximum value, respectively; $h_{i,j}$ and $h_{i,\text{total}}$ = the amount of protein (i) in fraction (j) and total bound protein (i), respectively; $h_{j,\text{total}}$ = total amount of protein in fraction (j); $h_{\text{total},ms}$ = total mass of protein bound to column; $b_{1}$ = scaling parameter; $\alpha$ = steric factor; $MW_{i}$ and $MW_{\text{ref}}$ = molecular weight of protein (i) or reference protein, respectively.
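The scoring in (a) and (b) can be illustrated with a minimal Python sketch. It assumes each eluted fraction is supplied as a mapping from a protein identifier to its amount in that fraction, and that the total mass bound to the column equals the sum of all amounts over all fractions; the function names, argument names, and data layout are hypothetical and are not part of the disclosure.

```python
from typing import Dict, List

Fraction = Dict[str, float]  # protein id -> amount h_ij in one fraction (assumed layout)


def total_bound(fractions: List[Fraction]) -> Dict[str, float]:
    """h_i,total: amount of each protein summed over all fractions."""
    totals: Dict[str, float] = {}
    for amounts in fractions:
        for protein, h_ij in amounts.items():
            totals[protein] = totals.get(protein, 0.0) + h_ij
    return totals


def importance_scores(
    fractions: List[Fraction],
    eluent_conc: List[float],      # y_cj for each fraction j
    y_max: float,                  # maximum eluent concentration
    mol_weight: Dict[str, float],  # MW_i for each protein
    mw_ref: float,                 # MW_ref
    b1: float = 1.0,               # scaling parameter
    alpha: float = 1.0,            # steric factor
) -> Dict[str, float]:
    """Importance score per protein, returned in descending order."""
    h_i_total = total_bound(fractions)
    scores = {protein: 0.0 for protein in h_i_total}
    for amounts, y_cj in zip(fractions, eluent_conc):
        h_j_total = sum(amounts.values())
        for protein, h_ij in amounts.items():
            if h_ij == 0.0:
                continue  # a zero amount adds nothing and would risk 0/0
            scores[protein] += (
                b1
                * (y_cj / y_max)
                * (h_ij / h_i_total[protein])
                * (h_ij / h_j_total)
                * (mol_weight[protein] / mw_ref) ** alpha
            )
    return dict(sorted(scores.items(), key=lambda kv: kv[1], reverse=True))


def recovery_potentials(fractions: List[Fraction]) -> Dict[str, float]:
    """h_i,total / h_total,ms, taking h_total,ms as the sum over all proteins (assumption)."""
    h_i_total = total_bound(fractions)
    h_total_ms = sum(h_i_total.values())
    return {protein: h / h_total_ms for protein, h in h_i_total.items()}
```

In this sketch a protein that elutes late (high eluent concentration) and dominates its fraction receives a high importance score, while its recovery potential depends only on its share of the total bound mass.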

56. The separatome of 55, wherein said affinity ligand in said column affinity chromatography employing an affinity ligand bound to a solid phase is selected from the group consisting of an amino acid, a divalent metal ion, a carbohydrate, an organic dye, a coenzyme, glutathione S-transferase, avidin, heparin, protein A, and protein G.

57. The separatome of 56, wherein peptides, polypeptides, and proteins are eluted from said affinity chromatography column using an elution agent y selected from the group consisting of a common salt, hydronium ion, imidazole, glutathione, a chaotropic salt, heparin, and glycine.

58. The separatome of 55, wherein said column adsorption-based, non-affinity chromatography is selected from the group consisting of ion exchange chromatography, reverse phase chromatography, and hydrophobic interaction chromatography.

59. The separatome of 58, wherein peptides, polypeptides, and proteins are eluted from said adsorption-based, non-affinity chromatography column using an elution agent y selected from the group consisting of a common salt, hydronium ion, and an organic solvent.

60. The separatome of 57 or 59, wherein said common salt is selected from the group consisting of a chloride salt, a sulfate salt, an acetate salt, a carbonate salt, and a propionate salt.

61. The separatome of 59, wherein said organic solvent is selected from the group consisting of methanol, 2-propanol, and acetonitrile.

62. The separatome of 57, wherein said chaotropic salt is guanidine hydrochloride.

63. The separatome of any one of 55-62, wherein the maximum value of said elution agent y is defined by y_(max) in 55.

64. The separatome of any one of 55-63, which is in a form selected from the group consisting of a table, a visual representation such as a figure, and a computer file.

65. The separatome of chromatographically relevant host cell peptides, polypeptides, or proteins for column affinity chromatography employing an affinity ligand bound to a solid phase of any one of 55-57, 60, or 62-64.

66. The separatome of chromatographically relevant host cell peptides, polypeptides, or proteins for column adsorption-based, non-affinity chromatography of any one of 55, 58-61, or 63-64.

67. A method for designing a reduced or modified proteome host cell, or a host cell in which expression of genes is reduced or completely inhibited compared to expression of said genes in the parent cell from which said host cell is derived, for expression of a target recombinant peptide, polypeptide, or protein to improve the chromatographic separation efficiency of said target recombinant peptide, polypeptide, or protein expressed in said host cell, comprising identifying and ranking proteins of chromatographic relevance that adversely affect said separation efficiency of said target recombinant peptide, polypeptide, or protein in a parent cell from which said host cell is derived by:

i) equilibrating an affinity chromatography column employing an affinity ligand bound to a solid phase, or an adsorption-based, non-affinity chromatography column, using a mobile loading or eluting phase, or an operational variable;

ii) in the case where said target recombinant peptide, polypeptide, or protein is not secreted, fractionating a lysate of said host cell, or in the case where said target recombinant peptide, polypeptide, or protein is secreted from said host cell, fractionating the culture medium in which said host cell is grown, on said column by applying an elution gradient to elute peptide, polypeptide, or protein fractions from said column;

iii) identifying, quantifying, and scoring peptides, polypeptides, or proteins in said fractions eluted from said column;

iv) assessing the metabolic role of said peptides, polypeptides, or protein identified in step iii) that affect column capacity; and

v) designing a reduced or modified genome host cell, or a host cell in which expression of genes is reduced or completely inhibited compared to expression of said genes in the parent cell from which said host cell is derived, to modify the proteome of said parent cell from which said host cell is derived in order to increase chromatographic separation efficiency based on steps iii) and iv).
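Steps iii) through v) can be pictured as a ranking followed by a filter. The Python sketch below is illustrative only: it assumes each candidate already carries an importance score and a recovery potential from step iii), reduces the metabolic assessment of step iv) to a single hypothetical `essential` flag, and uses a hypothetical `max_genes` cutoff. None of these names or simplifications come from the disclosure.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    gene: str                  # placeholder gene label
    importance: float          # score from step iii)
    recovery_potential: float  # share of bound mass from step iii)
    essential: bool            # simplified stand-in for the step iv) assessment


def design_knockouts(candidates: List[Candidate], max_genes: int) -> List[Candidate]:
    """Rank candidates by importance and keep the top non-essential ones."""
    ranked = sorted(candidates, key=lambda c: c.importance, reverse=True)
    nonessential = [c for c in ranked if not c.essential]
    return nonessential[:max_genes]
```

A real assessment of metabolic role would weigh growth rate, viability, and expression capacity rather than a single flag; the flag here only marks where that judgment enters the design.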

68. The method of 67, further comprising reducing or modifying the genome of said parent cell from which said host cell is derived, or reducing or completely inhibiting the expression of peptides, polypeptides, or proteins in said parent cell, to increase chromatographic separation efficiency based on step v), thereby producing a host cell comprising a reduced or modified genome compared to the genome in said parent cell from which said host cell is derived, or a host cell in which expression of peptides, polypeptides, or proteins is reduced or completely inhibited.

69. The method of 67 or 68, wherein said reduced or modified proteome host cell, or said host cell in which expression of (n) genes is reduced or completely inhibited compared to expression of said genes in the parent cell from which said host cell is derived, facilitates an overall capacity recovery of said target recombinant peptide, polypeptide, or protein in the range of from about 5%, from about 10%, from about 20%, from about 30%, from about 40%, from about 50%, from about 60%, from about 70%, from about 80%, from about 90%, or from about 95%, to about 100%, wherein capacity recovery is defined by summing (n) values of recovery potential for individual (i) proteins by the following:

${{capacity}\mspace{14mu} {recovery}} = {100\% \mspace{14mu} x\mspace{11mu} {\sum\limits_{i = 1}^{n}{{recovery}\mspace{14mu} {potential}_{i}}}}$

wherein n=total number of proteins that are deleted, inhibited, or modified, and i=an individual protein.

A preferred range for capacity recovery is from about 3% to about 50%, more preferably from about 5% to about 40%, or from about 5% to about 35%.
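The summation defining capacity recovery can be sketched in a few lines of Python. The function name is hypothetical, and the input is assumed to be the recovery potentials (as fractions of total bound mass) of the n proteins that are deleted, inhibited, or modified.

```python
from typing import Iterable


def capacity_recovery(recovery_potentials: Iterable[float]) -> float:
    """Capacity recovery in percent: 100% times the sum of recovery potentials."""
    return 100.0 * sum(recovery_potentials)
```

For instance, three removed proteins with made-up recovery potentials of 0.04, 0.03, and 0.02 would together give a capacity recovery of about 9%, which falls inside the preferred range stated above.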

70. The method of any one of 67-69, wherein step i) is modified by varying the characteristics of said mobile loading or eluting phase or operational variable.

71. The method of any one of 67-70, wherein identification of said peptides, polypeptides, or proteins in step iii) is performed by comparing the LC-MS signature of said peptides, polypeptides, or proteins to publicly available standards.

72. The method of any one of 67-71, wherein quantification of said proteins in step iii) is performed using spectral counting, or a combination of Bradford protein assay, 2-dimensional electrophoresis, and densitometry.

73. The method of any one of 67-72, wherein said scoring in step iii) is calculated as in 55.

74. The method of any one of 67-73, wherein assessing the metabolic role of identified proteins in step iv) is performed by bioinformatics techniques.

75. A method of enriching the amount of a target recombinant peptide, polypeptide, or protein relative to other peptides, polypeptides, or proteins present in an initial protein mixture comprising said target recombinant peptide, polypeptide, or protein, comprising:

i) selecting a chromatography medium that binds said target recombinant peptide, polypeptide, or protein from the group consisting of an affinity chromatography medium and an adsorption-based, non-affinity chromatography medium;

ii) in the case where an affinity chromatography medium is selected, expressing said target recombinant peptide, polypeptide, or protein in said host cell of any one of 1-32, 34, 36, 38-44, 48, or 51-54;

iii) in the case where an adsorption-based, non-affinity chromatography medium is selected, expressing said target recombinant peptide, polypeptide, or protein in said host cell of any one of 1-31, 33-35, 37-42, or 45-54; and

iv) chromatographing said initial protein mixture comprising said target recombinant peptide, polypeptide, or protein using said chromatography medium of step ii) or step iii), as appropriate, and collecting elution fractions, thereby obtaining one or more fractions containing an enriched amount of said target recombinant peptide, polypeptide, or protein relative to other peptides, polypeptides, or proteins in said fraction compared to the amount of said target recombinant peptide, polypeptide, or protein relative to other peptides, polypeptides, or proteins in said initial protein mixture.
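The enrichment described in step iv) compares the amount of target relative to other proteins in an eluted fraction with the same relative amount in the initial mixture. One way to express that comparison as a single number is the ratio sketched below in Python; treating enrichment as this ratio, and the function and parameter names, are assumptions made for illustration.

```python
def relative_amount(target: float, others: float) -> float:
    """Amount of target relative to all other peptides, polypeptides, or proteins."""
    if target <= 0 or others <= 0:
        raise ValueError("target and others must both be positive")
    return target / others


def enrichment_factor(
    target_initial: float,
    others_initial: float,
    target_fraction: float,
    others_fraction: float,
) -> float:
    """Ratio of relative target amount in a fraction to that in the initial mixture."""
    return relative_amount(target_fraction, others_fraction) / relative_amount(
        target_initial, others_initial
    )
```

A value above 1 indicates that the fraction is enriched in the target compared with the initial protein mixture.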

76. The method of 75, further comprising chromatographing an enriched fraction of step iv) to obtain said target recombinant peptide, polypeptide, or protein in a desired degree of purity.

77. The method of 76, further comprising recovering said target recombinant peptide, polypeptide, or protein.

78. A method of preparing a pharmaceutical or veterinary composition comprising a recombinant therapeutic peptide, polypeptide, or protein, comprising:

i) selecting a chromatography medium that binds said recombinant therapeutic peptide, polypeptide, or protein from the group consisting of an affinity chromatography medium and an adsorption-based, non-affinity chromatography medium;

ii) in the case where an affinity chromatography medium is selected, expressing said recombinant therapeutic peptide, polypeptide, or protein in said host cell of any one of 1-32, 34, 36, 38-44, 48, or 51-54;

iii) in the case where an adsorption-based, non-affinity chromatography medium is selected, expressing said recombinant therapeutic peptide, polypeptide, or protein in said host cell of any one of 1-31, 33-35, 37-42, or 45-54;

iv) in the case where said recombinant therapeutic peptide, polypeptide, or protein is not secreted from said host cell, preparing a lysate of said host cell containing said recombinant therapeutic peptide, polypeptide, or protein, producing an initial recombinant therapeutic peptide-, polypeptide-, or protein-containing mixture; or

v) in the case where said recombinant therapeutic peptide, polypeptide, or protein is secreted from said host cell, harvesting culture medium in which said host cell is grown, containing said recombinant therapeutic peptide, polypeptide, or protein, thereby obtaining an initial recombinant therapeutic peptide-, polypeptide-, or protein-containing mixture;

vi) chromatographing said initial recombinant therapeutic peptide-, polypeptide-, or protein-containing mixture of step iv) or step v) using said chromatography medium of step i) or step ii), as appropriate, and collecting elution fractions, thereby obtaining one or more fractions containing an enriched amount of said recombinant therapeutic peptide, polypeptide, or protein relative to other peptides, polypeptides, or proteins in said fraction compared to the amount of said recombinant therapeutic peptide, polypeptide, or protein relative to other peptides, polypeptides, or proteins in said initial protein mixture;

vii) further chromatographing an enriched fraction of step vi) to obtain said recombinant peptide, polypeptide, or protein in a desired degree of purity;

viii) recovering said recombinant therapeutic peptide, polypeptide, or protein; and

ix) formulating said recombinant therapeutic peptide, polypeptide, or protein with a pharmaceutically or veterinarily acceptable carrier, diluent, or excipient to produce a pharmaceutical or veterinary composition, respectively.

79. A method of purifying a recombinant enzyme, comprising:

i) selecting a chromatography medium that binds said recombinant enzyme from the group consisting of an affinity chromatography medium and an adsorption-based, non-affinity chromatography medium;

ii) in the case where an affinity chromatography medium is selected, expressing said recombinant enzyme in said host cell of any one of 1-32, 34, 36, 38-44, 48, or 51-54;

iii) in the case where an adsorption-based, non-affinity chromatography medium is selected, expressing said recombinant enzyme in said host cell of any one of 1-31, 33-35, 37-42, or 45-54;

iv) in the case where said recombinant enzyme is not secreted from said host cell, preparing a lysate of said host cell containing said recombinant enzyme, producing an initial recombinant enzyme-containing mixture; or

v) in the case where said recombinant enzyme is secreted from said host cell, harvesting culture medium in which said host cell is grown, containing said recombinant enzyme, thereby obtaining an initial recombinant enzyme-containing mixture;

vi) chromatographing said initial recombinant enzyme-containing mixture of step iv) or step v) using said chromatographic medium of step i) or step ii), as appropriate, and collecting elution fractions, thereby obtaining one or more fractions containing an enriched amount of said recombinant enzyme relative to other peptides, polypeptides, or proteins in said fraction compared to the amount of said recombinant enzyme relative to other peptides, polypeptides, or proteins in said initial protein mixture;

vii) further chromatographing an enriched fraction of step vi) to obtain said recombinant enzyme in a desired degree of purity; and

viii) recovering purified recombinant enzyme.

80. The method of 79, further comprising placing said purified recombinant enzyme in a buffer solution in which said purified recombinant enzyme is stable and retains enzymatic activity.

81. The method of 80, wherein said purified recombinant enzyme-containing buffer solution is reduced to dryness.

82. The method of 81, wherein said dry purified recombinant enzyme-containing buffer solution is in the form of a powder.

83. A kit, comprising said host cell of any one of 1-54 or 68-69.

84. The kit of 83, further comprising instructions for expressing a target recombinant peptide, polypeptide, or protein in said host cell.

85. The kit of 84, wherein said target recombinant peptide, polypeptide, or protein is an endogenous or heterologous target recombinant peptide, polypeptide, or protein.

86. The kit of any one of 83-85, wherein said instructions further comprise directions for purifying said expressed target recombinant peptide, polypeptide, or protein by affinity chromatography or adsorption-based, non-affinity chromatography.

87. The kit of any one of 83-86, further comprising a chromatographic resin for affinity chromatography or adsorption-based, non-affinity chromatography.

88. A method of enriching a target peptide, polypeptide, or protein from a mixture obtained from a host cell, comprising:

a. chromatographing said mixture via affinity chromatography or adsorption-based, non-affinity chromatography;

b. collecting an elution fraction that contains an enriched amount of said target peptide, polypeptide, or protein in said fraction compared to the amount of said peptide, polypeptide, or protein of interest in said mixture; and

c. recovering said target peptide, polypeptide, or protein from said elution fraction,

wherein said host cell is derived from a parent cell, and has:

i) a reduced genome compared to the genome in the parent cell from which it is derived, or

ii) a modified genome compared to the genome in the parent cell from which it is derived, or

iii) in which expression of genes is reduced or completely inhibited compared to expression of said genes in the parent cell from which it is derived,

wherein genes that are deleted, modified, or the expression of which is reduced or completely inhibited in said host cell, code for peptides, polypeptides, or proteins that impair the chromatographic separation efficiency of said target peptide, polypeptide, or protein expressed in said host cell in said affinity chromatography or said adsorption-based, non-affinity chromatography.

89. The method of 88, wherein said mixture is a lysate of said host cell in the case where said peptide, polypeptide, or protein accumulates intracellularly, or is medium in which said host cell is grown in the case where said peptide, polypeptide, or protein is secreted by said host cell.

90. The method of 88 or 89, further comprising chromatographing said target peptide, polypeptide, or protein of step c. in order to obtain said target peptide, polypeptide, or protein in a desired degree of purity.

91. The method of 90, further comprising recovering purified target peptide, polypeptide, or protein.

The methods of 88-91 encompass the use of all of the parent cells, host cells, and methods, etc., disclosed herein, and described in 1-87, above.

Further scope of the applicability of the present invention will become apparent from the detailed description and drawing(s) provided below. However, it should be understood that the detailed description and specific examples, while indicating preferred embodiments of the invention, are given by way of illustration only since various changes and modifications within the spirit and scope of the invention will become apparent to those skilled in the art from this detailed description.

BRIEF DESCRIPTION OF THE DRAWINGS

The above and other aspects, features, and advantages of the present invention will be better understood from the following detailed descriptions taken in conjunction with the accompanying drawing(s), all of which are given by way of illustration only, and are not limitative of the present invention, in which:

FIG. 1 shows a CIRCOS® rendering of model data used to describe multiple separatomes. CIRCOS® is a software package that applies the circular ideogram layout to display relationships between genomic intervals. It is described in Krzywinski et al. (2009) “Circos: an Information Aesthetic for Comparative Genomics”, Genome Res. 19:1639-1645. In the figure, the ring is comprised of segments that represent either gene positions or % B. Four different separatomes associated with popular methods of chromatography (IMAC, immobilized metal affinity chromatography; AEX, anion exchange chromatography; CEX, cation exchange chromatography; and HIC, hydrophobic interaction chromatography) are represented. Connecting lines map individual proteins contained within a separatome to their gene (located on the outer ring), with the concentric (inner gray) ring describing the concentration of the protein found in the fractions as they elute from a column as indicated by the length of the black bar segments. Other data that could be depicted in a CIRCOS® rendering include gene designation, essentiality of gene product, metabolic category, or other parameter, placed on a series of concentric rings or attached to the connecting lines, for example as shown by the other concentric ring fragment.

FIG. 2 shows a CIRCOS® rendering of model data describing the separatome of E. coli for ion exchange chromatography. Similar to FIG. 1 is the use of connecting lines that indicate genes associated with proteins found in the separatome of E. coli for a particular resin/equilibrating condition. However, this rendering provides detail as to the elution fraction by connecting a gene to a particular box on the ring that represents a salt concentration. The lower black fragment of the circle entitled “Escherichia coli genome” can contain the location of genes present on the E. coli chromosome. Each box represents a different cut from a column.

FIG. 3 shows the distribution of proteins contained within various IMAC fractions that elute from a Ni(II) column. In particular, note the low concentration of host cell proteins within the 120 mM fraction.

FIG. 4 shows a Western blot (a) and protein gel (b) that indicate lack of expression of gene products of yfbG, adhP, and cyoA. Lack of expression is indicated by absence of spot or band.

FIG. 5 shows removal of thyA prior to homologous recombination.

FIG. 6 shows removal of a gene targeted for deletion via a two-step process.

FIG. 1 relates to the detailed description of the invention. FIGS. 2, 5, and 6 relate to Example 2, Construction of the Ion Exchange Separatome of E. coli and Its Use to Design and Build Novel Host Strains for a Common Chromatography Resin. FIGS. 3 and 4 refer to Example 1, Identification of Host Cell Proteins Associated With a Specific Product, Histidine-Tagged Green Fluorescent Protein, as a Comparative Example.

DETAILED DESCRIPTION OF THE INVENTION

The following detailed description of the invention is provided to aid those skilled in the art in practicing the present invention. Even so, the following detailed description should not be construed to unduly limit the present invention, as modifications and variations in the embodiments herein discussed may be made by those of ordinary skill in the art without departing from the spirit or scope of the present inventive discovery.

The contents of each of the references cited herein are herein incorporated by reference in their entirety.

As noted above, Asenjo et al. (2004), “Is there a rational method to purify proteins? From expert systems to proteomics”, Journal of Molecular Recognition 17:236-247, points out that, “Until now, it has been virtually impossible to select separation and purification operations for proteins either for therapeutic or analytical application in a rational manner due to lack of fundamental knowledge on the molecular properties of the materials to be separated and the lack of an efficient system to organize such information.” The present invention provides solutions to this problem.

The inventions disclosed herein include a separatome-based protein expression and purification platform based on the juxtaposition of the chromatographic binding properties of genomic peptides, polypeptides, and proteins with the characteristics and location of genes on the target chromosome, such as those of E. coli, Bacillus subtilis, yeasts, and other host cells. The separatome-based protein expression and purification platform maps the separatome of target chromosomes based on relationships between the loci of genes associated with nuisance peptides, polypeptides, and proteins. In addition, the separatome-based protein expression and purification platform reduces the genome of host cells through precisely targeted modifications to create custom, robust target host strains with reduced nuisance peptides, polypeptides, and proteins. Moreover, the present separatome-based protein expression and purification platform provides a computerized knowledge tool that, given separatome data regarding a target peptide, polypeptide, or protein, intuitively suggests strategies leading to efficient purification. The separatome-based protein expression and purification platform is an efficient bioseparation system that intertwines host cell strain and chromatography.

Definitions

The following definitions are provided to aid the reader in understanding the various aspects of the present invention. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by those of ordinary skill in the art to which the invention pertains.

“An affinity ligand” for affinity chromatography refers to a chemical moiety, coupled to a stationary phase, that serves as a biospecific sorptive group.

“Host cell” refers to a cell used to express an endogenous or heterologous nucleic acid sequence encoding a target peptide, polypeptide, or protein of interest.

“Parent cell from which it is derived” refers to a cell that is modified to serve as a host cell of the present invention. As a non-limiting example, an E. coli parent cell can be a conventional E. coli K-12 cell.

The phrase “a target recombinant therapeutic peptide, polypeptide, or protein” and the like refers to a peptide, polypeptide, or protein exhibiting human or veterinary medicinal properties, expressed using recombinant nucleic acid methodology. As used herein, “medicinal properties” broadly includes not only medical therapeutic applications, but use for nutritional purposes and personal care as well.

The phrase “endogenous target recombinant peptide, polypeptide or protein” and the like refers to a peptide, polypeptide, or protein native to a host cell, which is expressed in such cell using recombinant nucleic acid methodology.

It should be noted that the present invention, including all the parent cells, host cells, methods, etc., disclosed herein, is applicable not only to the expression and purification of endogenous peptides, polypeptides, or proteins via recombinant methods, but also to the expression and purification of endogenous peptides, polypeptides, and proteins that are naturally expressed within host cells, i.e., without the application of recombinant methodology.

The phrase “heterologous target recombinant peptide, polypeptide or protein” and the like refers to a peptide, polypeptide, or protein not native to a host cell, which is expressed in such cell using recombinant nucleic acid methodology.

The phrase “a modified genome compared to the genome in the parent cell from which it is derived” refers to modification of genes to abate the undesirable effect(s) of the gene products on separation efficiency performed by, for example, point mutation, isozyme substitution, transposon mutagenesis, etc. As indicated, modification includes gene substitution.

“Proteome” refers to a collection of identifiable proteins expressed by a host cell.

“Chromatotome” refers to a proteome defined by a set of host cell proteins that bind a chromatographic stationary phase.

“Separatome” refers to a proteome defined by a set of host cell proteins that are associated with a separation technique (not limited to packed bed chromatography).

“Metalloproteome” refers to a proteome with the identifying characteristic of interaction with metals or metal ions.

“Metabolome” refers to a collection of small-molecule metabolites like glucose-6-phosphate and other molecules of similar molecular weight.

“Separation efficiency” is manifested as separation capacity, separation selectivity, or both. In many cases, separation capacity is a more important parameter for the practice of the present invention.

“Separation capacity” refers to the amount of peptides, polypeptides, and/or proteins that can be captured during the loading cycle of a chromatographic separation. Separation capacity is defined as the amount of target recombinant peptide, polypeptide, or protein adsorbed by a column per mass of lysate fed to the column. The present invention encompasses increases in separation capacity in the range of from about 5% to about 35% or more. Such increases reflect an advantage of the present separatome invention concept over the separation capacities achievable using standard extraction and purification methods.

“Separation selectivity” refers to the amount of target protein/total protein captured by a chromatographic adsorbent. Separation selectivity is defined as the amount of target recombinant peptide, polypeptide, or protein adsorbed by the column per total protein adsorbed to the column.
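As a minimal sketch of the two definitions above, the following Python functions compute separation capacity and separation selectivity from measured masses. The function names and all example masses are hypothetical and chosen only for illustration; they are not data from this disclosure.

```python
def separation_capacity(target_adsorbed_mg, lysate_fed_mg):
    """Target adsorbed by the column per mass of lysate fed to the column."""
    if lysate_fed_mg <= 0:
        raise ValueError("lysate_fed_mg must be positive")
    return target_adsorbed_mg / lysate_fed_mg


def separation_selectivity(target_adsorbed_mg, total_adsorbed_mg):
    """Target adsorbed by the column per total protein adsorbed to the column."""
    if total_adsorbed_mg <= 0:
        raise ValueError("total_adsorbed_mg must be positive")
    return target_adsorbed_mg / total_adsorbed_mg


def percent_increase(new, old):
    """Relative change of `new` over `old`, in percent."""
    return 100.0 * (new - old) / old


# Hypothetical masses (mg) for a parent strain and a modified host strain.
parent = separation_capacity(target_adsorbed_mg=20.0, lysate_fed_mg=400.0)
modified = separation_capacity(target_adsorbed_mg=25.0, lysate_fed_mg=400.0)
gain = percent_increase(modified, parent)
selectivity = separation_selectivity(target_adsorbed_mg=25.0, total_adsorbed_mg=50.0)
print(f"capacity gain: {gain:.1f}%  selectivity: {selectivity:.2f}")
```

With these illustrative numbers the capacity gain is roughly 25%, which would fall inside the range of about 5% to about 35% stated above.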

“Percent B” refers to a proportion or amount, expressed as a number between 0 and 100%, of a mixture fed to a chromatography column comprised of a blend of two fluids of different compositions, i.e., composition A and composition B. As % B increases, the change in mobile phase composition causes proteins to be eluted in a differential fashion, beginning with those of low affinity.

“Strongly retained” refers to peptides, polypeptides, and proteins that elute from a chromatography column upon desorption due to stringent changes in mobile phase composition identified by “percent B”.

“Weakly retained” refers to peptides, polypeptides, and proteins that elute from a chromatography column upon desorption due to small changes in mobile phase composition identified by “percent B”.
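The relationship between % B and mobile phase composition can be sketched as a linear blend of the two feed compositions. This is an assumption (ideal volumetric mixing), and the salt concentrations used for compositions A and B below are hypothetical values, as is the function name.

```python
def eluent_concentration(percent_b, conc_a, conc_b):
    """Concentration of an eluent species in a blend of fluids A and B.

    Assumes ideal (linear) mixing, where percent_b is the share of
    fluid B in the feed, expressed between 0 and 100.
    """
    if not 0.0 <= percent_b <= 100.0:
        raise ValueError("percent_b must lie between 0 and 100")
    fraction_b = percent_b / 100.0
    return (1.0 - fraction_b) * conc_a + fraction_b * conc_b


# Hypothetical step gradient: A contains 0 mM NaCl, B contains 1000 mM NaCl.
for percent_b in (0, 10, 25, 50, 100):
    conc = eluent_concentration(percent_b, conc_a=0.0, conc_b=1000.0)
    print(f"{percent_b:>3}% B -> {conc:.0f} mM NaCl")
```

Under this sketch, proteins that desorb only at the higher % B steps correspond to the strongly retained class, and those that desorb at the lower steps to the weakly retained class.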

“Common salt” refers to a compound that dissociates in water to form a cation and an anion, such as a chloride salt, a sulfate salt, an acetate salt, a carbonate salt, a propionate salt, etc., as would be apparent to one of ordinary skill in the art. Common cations in such salts are, for example, sodium, potassium, and ammonium cations.

The phrase “chromatographically relevant host cell peptides, polypeptides, or proteins for column affinity chromatography or column adsorption-based, non-affinity chromatography” refers to proteins of a separatome or chromatotome.

“Importance” refers to the degree to which, should a host cell peptide, polypeptide, or protein be deleted, modified, or inhibited, capacity recovery is impacted. Proteins of chromatographic relevance are considered important should large gains in capacity recovery be achieved through deletion, modification, or inhibition. “Important” proteins are therefore a subset of relevant proteins.

“Reduced” in the context of the level of expression of peptides, polypeptides, or proteins from host cell genes refers to diminution in the amount of such expression products in the range of from about 5% to about 95%, more preferably from about 10% to about 95%, or more, compared to the level of such products normally observed in parent cells from which such host cells are derived.

“Scoring” refers to rank ordering members of a separatome to identify host cell peptides, polypeptides, or proteins that impair the chromatographic separation efficiency of a target recombinant peptide, polypeptide, or protein expressed in the host cell, and to establish quantitative improvements gained through their elimination.

“Operational variable” refers to a condition or operating parameter that leads to different Damköhler, Biot, or Péclet numbers used to describe a separation technique.

“Purify, purifying, purified” and the like refer to the process by which a peptide, polypeptide, or protein in a mixture is enriched so as to contain lesser amounts of materials derived from the host cell in which it is expressed, and the enriched product, respectively.

“Plant cells” includes cells of flowering and non-flowering plants, as well as algal cells, for example Chlamydomonas, Chlorella, etc.

Certain claims have unique formulae to mathematically define the non-metabolic aspects of the separatome, with specific regard to the overall impact a peptide, polypeptide, or protein has on column efficiency. A peptide, polypeptide, or protein elutes or emerges from a column as a peak of material, first at low concentration increasing to a maximum value, then decreasing back to zero, in the characteristic shape of a bell-like curve. The peak adopts a shape that may be described as sharp/narrow, with the majority of material of interest contained in a few fractions; broad/shallow, with the majority of material present in multiple fractions; or something in between. The time (retention time) at which the peak emerges is governed by binding strength. Peptides, polypeptides, and proteins with high affinity towards a ligand require more stringent conditions for desorption to occur, whereas those with low affinity pass through the column unretained. The ability to capture both phenomena, namely peak shape and retention time, is important to quantitatively establish the chromatographic relevance of a peptide, polypeptide, or protein. Once the relevance for a set of peptides, polypeptides, or proteins is established, molecular biology techniques are then used to delete, modify, inhibit the expression of, or substitute genes associated with these interfering molecules to directly increase column capacity and indirectly enhance purity.

Defining “recovery potential” for protein (i) first involves determining the fractional capacity occupied by this particular host cell protein by:

$\text{recovery potential}_{i} = \frac{h_{i,\text{total}}}{h_{\text{total},ms}}$

with h_(total,ms)=total amount of host cell proteins bound to column, and h_(i,total)=the bound amount attributed to (i). The value of recovery potential is bounded by zero and one, with a value of one indicating that a single host cell protein, if removed from the separatome, would achieve complete recovery of the column capacity. Extending this argument to the removal of (n) proteins, the term “capacity recovery” is defined in general as:

$\text{capacity recovery} = 100\% \times \sum_{i=1}^{n} \text{recovery potential}_{i}$

where the sigma operator allows one to sum the individual contributions for the set of (n) proteins. In the equation, n refers to the number of proteins, and i is an individual protein.
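The two relationships above can be sketched in a few lines of Python. This is an illustrative sketch only: the function names, the protein labels, and the bound amounts are hypothetical, and h_(total,ms) is assumed here to equal the sum of the listed bound amounts.

```python
def recovery_potential(h_i_total, h_total_ms):
    """Fraction of the total bound host cell protein attributed to protein i."""
    if h_total_ms <= 0:
        raise ValueError("h_total_ms must be positive")
    return h_i_total / h_total_ms


def capacity_recovery(bound_amounts, removed):
    """Percent of column capacity recovered by removing the proteins in `removed`.

    bound_amounts maps each bound host cell protein to its bound amount
    h_(i,total); their sum is taken as h_(total,ms).
    """
    h_total_ms = sum(bound_amounts.values())
    return 100.0 * sum(
        recovery_potential(bound_amounts[name], h_total_ms) for name in removed
    )


# Hypothetical bound amounts (mg) for three host cell proteins.
bound = {"protein_1": 4.0, "protein_2": 1.0, "protein_3": 5.0}
print(capacity_recovery(bound, ["protein_1", "protein_2"]))
```

In this example, removing the first two proteins would recover about half of the capacity otherwise occupied by host cell proteins, while removing all three would recover all of it.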

These two simple relationships provide the starting point to define how much capacity can be gained as genes are deleted, modified, inhibited, or substituted. The relationships do not, however, establish order or priority within the context of peak shape and retention time. The latter is important to the disclosed invention because as mentioned previously, it is desired to focus efforts on common, problematic host cell proteins rather than those that are specific to a target recombinant product. Strongly retained or high affinity host cell proteins that are bound and that subsequently reduce column capacity would be generally problematic due to their persistent presence. Other qualifiers generally regarded as problematic would include high molecular weight (steric effects at high loading), sensitivity to proteolysis (multiple peaks or broad peak for a single protein), and propensity for subunit adsorption (multiple peaks or broad peak for a single protein). A criterion has been developed to score the “importance” of a protein (i) within a separatome, namely:

$\text{importance}_{i} = \sum_{j} \left[ b_{1} \left( \frac{y_{c_{j}}}{y_{\max}} \right) \left( \frac{h_{i,j}}{h_{i,\text{total}}} \right) \left( \frac{h_{i,j}}{h_{j,\text{total}}} \right) \left( \frac{MW_{i}}{MW_{\text{ref}}} \right)^{\alpha} \right]_{i}$

with the following definitions: b₁=scaling parameter; y_(cj) and y_(max)=concentration of mobile phase eluent in fraction (j) and maximum value, respectively; h_(i,j) and h_(i,total)=the amount of protein (i) in fraction (j) and total bound protein (i), respectively; h_(j,total)=total amount of protein in fraction (j); MW_(i)=molecular weight of protein (i); MW_(ref)=molecular weight of a reference protein within the separatome; α=steric factor; and i=protein. These ratio terms (the y's and h's) adopt values between 0 and 1, yet hold different significances. A protein that remains bound and requires stringent conditions for elution has a y ratio close to, if not equal to, unity. A protein that emerges as a tight peak presents with a second ratio, h_(i,j)/h_(i,total), close to unity, and finally, should that emerging peak constitute the majority of fraction (j), the third ratio, h_(i,j)/h_(j,total), would be close to unity. Multiplying each ratio, and summing the product of these ratios for each fraction (j) where (i) is present provides a quantitative ranking. For example, a protein that is retained at high NaCl concentration and emerges as a sharp peak would be deemed chromatographically relevant and will be scored as high with this formula. A second example would be a protein that broadly elutes. It would also receive a high score, or relevancy, by virtue of its presence in multiple fractions.

Lastly, steric effects require consideration. As a chromatography column becomes loaded, larger proteins interact with multiple ligands either directly through adsorption, or indirectly through hindrance of binding. When steric effects require consideration, the basic equation contains a molecular weight ratio raised to a power that is descriptive of these phenomena. A non-zero alpha in the above equation, with a preferred value between 0 and 1, would indicate some degree of steric effects. Note that the general form of the importance equation also permits scale-parameters (b₁) to adjust the weighting of a particular score. For example, b₁ may be used to indicate metabolic necessity (b₁=0), meaning a zero value will force a low score because the corresponding gene likely will not be deleted from the genome.

To summarize, the basic form of the equation favors the elimination or deletion of peptides, polypeptides, or proteins that have high affinity toward the adsorbent, with some degree of freedom to permit the tailoring of the modifications should the host cell be expressly used for a single recombinant DNA product and not a variety of products.
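The importance criterion can be sketched as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: the data layout (a list of fractions, each with an eluent concentration `y` and a mapping `h` of protein to amount), the protein labels P and Q, and all numerical values are hypothetical.

```python
def importance(protein, fractions, y_max, mw, mw_ref, b1=1.0, alpha=0.0):
    """Importance score of one protein, summed over the fractions where it appears.

    fractions: list of dicts with keys "y" (eluent concentration in the
    fraction) and "h" (mapping of protein name to amount in that fraction).
    """
    h_i_total = sum(f["h"].get(protein, 0.0) for f in fractions)
    if h_i_total == 0:
        return 0.0
    score = 0.0
    for f in fractions:
        h_ij = f["h"].get(protein, 0.0)
        if h_ij == 0:
            continue  # protein absent from this fraction
        h_j_total = sum(f["h"].values())
        score += (b1 * (f["y"] / y_max) * (h_ij / h_i_total)
                  * (h_ij / h_j_total) * (mw / mw_ref) ** alpha)
    return score


# Hypothetical two-fraction elution: P elutes mostly late, Q mostly early.
fractions = [
    {"y": 0.5, "h": {"P": 1.0, "Q": 3.0}},
    {"y": 1.0, "h": {"P": 3.0, "Q": 1.0}},
]
mw = {"P": 50.0, "Q": 50.0}  # hypothetical molecular weights (kDa)
scores = {
    name: importance(name, fractions, y_max=1.0, mw=mw[name], mw_ref=50.0)
    for name in mw
}
ranking = sorted(scores, key=scores.get, reverse=True)
print(scores, ranking)
```

In this example P, which elutes mainly in the more stringent fraction, ranks above Q. Setting `b1=0` for a protein forces its score to zero, mirroring the metabolic-necessity case described above, and a non-zero `alpha` weights larger proteins more heavily.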

Commercially Important Protein Products

Exemplary, non-limiting, commercially important peptide, polypeptide, and protein products that can be expressed, recovered, and purified using the host cells, methods, and separatome information disclosed herein include, but are not limited to, the following.

Therapeutic Proteins

Examples of therapeutic human proteins that have been synthesized from genes cloned in bacteria and/or eukaryotic cells, or by expression in plants or animals, include antibodies and antigen-binding fragments; vaccines; α₁-antitrypsin (emphysema); deoxyribonuclease (cystic fibrosis); epidermal growth factor (ulcers); erythropoietin (anemia); Factor VIII (hemophilia); Factor IX (Christmas disease); fibroblast growth factor (ulcers); follicle stimulating hormone (infertility treatment); granulocyte colony stimulating factor (cancers); insulin (diabetes); insulin-like growth factor I (growth disorders); interferon-α (leukemia and other cancers); interferon-β (cancers, AIDS); interferon-γ (cancers, rheumatoid arthritis); interleukins (cancers, immune disorders); lung surfactant protein (respiratory distress); relaxin (aid in childbirth); serum albumin (plasma supplement); somatostatin (growth disorders); somatotropin (growth disorders); superoxide dismutase (free radical damage in kidney transplants); tissue plasminogen activator (heart attack); tumor necrosis factor (cancers).

Proteins and Enzymes Used in Analytical Applications

In addition to the use of antibodies and enzymes as therapeutic agents, they are also used in the diagnosis of diseases as the components of some confirmatory tests of certain diagnostic procedures. Hexokinase and glucose oxidase are used in the quantification of glucose in the serum and urine. Glucose oxidase is used in glucose electrodes. Uricase is used for the estimation of uric acid present in urine. Alkaline phosphatase, horseradish peroxidase, and antibodies are used in ELISA (Enzyme Linked Immunosorbent Assay).

Industrial Enzymes and Proteins

Industrially useful enzymes include carbohydrate-hydrolyzing enzymes such as amylases, cellulase, invertases, etc.; proteolytic enzymes such as papain, trypsin, chymotrypsin, etc.; and other bacterial and fungal-derived proteolytic enzymes and lipases that can hydrolyze various types of lipids and fats. All these enzymes are important in the food and beverage industries, the textile industry, paper industry, and detergent industry. Proteases have a special use in the beverage industry, meat and leather industries, cheese production, detergent industry, bread, and confectionery industry. Various types of lipases are used for the modifications of various types of lipids and fats, production of various organic acids including fatty acids, in detergents, and production of cocoa butter. In addition to all these, enzymes are used in chemical industries as reagents in organic synthesis for carrying out stereospecific reactions.

Non-Catalytic Functional Proteins

These commercially important proteins are used in the food industry as emulsifiers, for inducing gelation, water binding, foaming, whipping, etc. These non-catalytic functional proteins are classified as whey proteins. The proteins that remain in solution after the removal of casein are by definition called whey proteins.

Commercially available whey protein concentrates contain 35% to 95% protein. If they are added to food on a solids basis, there will be large differences in functionality due to the differences in protein content. Most food formulations call for a certain protein content and thus whey-protein concentrates are generally utilized as a constant protein base. In this case, the differences due to protein content as such should be eliminated. As the protein content increases, the composition of other components in the whey-protein concentrate must also change and these changes in composition have an effect on functionality.

Nutraceutical Proteins

Nutraceutical proteins represent a class of nutritionally-important proteins having therapeutic activity. The whey-protein concentrates and some of the milk proteins of infant foods contain certain pharmaceutical proteins having high nutritive quality. Infants get the required proteins from the mother's milk, which also contains certain therapeutic proteins that protect the baby from infection and other problems. There are other infant foods, which also have more or less the same composition as that of mother's milk, made up of cow's and buffalo's milk. All these food proteins provide the infants the raw building materials in the form of essential amino acids and at the same time protect them from microbial infections and other diseases.

Large Scale Enzyme Applications

Detergents

Bacterial proteinases are still the most important detergent enzymes. Lipases decompose fats into more water-soluble compounds. Amylases are used in detergents to remove starch-based stains.

Starch Hydrolysis and Fructose Production

The use of starch-degrading enzymes was the first large-scale application of microbial enzymes in the food industry. Mainly two enzymes carry out conversion of starch to glucose: alpha-amylase and fungal enzymes. Fructose is produced from sucrose as a starting material. Sucrose is split by invertase into glucose and fructose, and fructose is separated and crystallized.

Beverages

Enzymes have many applications in the beverage industry. Lactase splits milk-sugar lactose into glucose and galactose. This process is used for milk products that are consumed by lactose intolerant consumers. Addition of pectinase, xylanase, and cellulase improves the liberation of the juice from pulp. Similarly, enzymes are widely used in wine production.

Textiles

The use of enzymes in the textile industry is one of the most rapidly growing fields in industrial enzymology. The enzymes used in the textile field are amylases, catalase, and laccases, which are used to remove starch, degrade excess hydrogen peroxide, bleach textiles, and degrade lignin.

Animal Feed

Addition of xylanase to wheat-based broiler feed has increased the available metabolizable energy 7-10% in various studies. Enzyme addition reduces viscosity, which increases absorption of nutrients, liberates nutrients either by hydrolysis of non-degradable fibers or by liberating nutrients blocked by these fibers, and reduces the amount of feces.

Baking

Alpha-amylases have been most widely studied in connection with improved bread quality and increased shelf life. Use of xylanases decreases the water absorption, and thus reduces the amount of added water needed, in baking. This leads to more stable dough. Proteinases can be added to improve dough-handling properties; glucose oxidase has been used to replace chemical oxidants and lipases to strengthen gluten, which leads to more stable dough and better bread quality.

Pulp and Paper

The major application in the pulp and paper industry is the use of xylanases in pulp bleaching. This considerably reduces the need for chlorine-based bleaching chemicals. In paper making, amylase enzymes are used especially in modification of starch. Pitch is a sticky substance present mainly in softwoods. Pitch causes problems in paper machines and can be removed by lipases.

Leather

The leather industry uses proteolytic and lipolytic enzymes in leather processing. Enzymes are used to remove unwanted parts. In dehairing and dewooling phases, bacterial proteases are used to assist the alkaline chemical process. This results in a more environmentally friendly process and improves the quality of the leather. Bacterial and fungal enzymes are used to make leather soft and easier to dye.

Specialty Enzymes

There are a large number of specialty applications for enzymes. These include the use of enzymes in analytical applications, flavor production, protein modification, personal care products, DNA-technology, and in fine chemical production.

Enzymes in Analytics

Enzymes are widely used in clinical analytical methodology. Contrary to bulk industrial enzymes, these enzymes need to be free from side activities. This means that elaborate purification processes are needed.

An important development in analytical chemistry is biosensors. The most widely used application is a glucose biosensor involving a glucose oxidase-catalyzed reaction. Several commercial instruments are available which apply this principle for measurement of molecules like glucose, lactate, lactose, sucrose, ethanol, methanol, cholesterol, and some amino acids.

Enzymes in Personal Care Products

Personal care products are a relatively new area for enzymes. Proteinase and lipase containing enzyme solutions are used for contact lens cleaning. Hydrogen peroxide is used in disinfection of contact lenses. The residual hydrogen peroxide after disinfection can be removed by catalase. Some toothpaste contains glucoamylase and glucose oxidase. Enzymes are also being studied for applications in skin and hair care products.

Enzymes Used in DNA-Technology

DNA-technology is an important tool in the enzyme industry. Most traditional enzymes are produced by organisms that have been genetically modified to overproduce desired enzymes. Recombinant DNA methodology has been used to engineer overproducing microorganisms, and employs enzymes such as nucleases (especially restriction endonucleases), ligases, polymerases, and DNA-modifying enzymes to modify genes and construct necessary expression cassettes and vectors.

Enzymes in Fine Chemical Production

In spite of some successes, commercial production of chemicals by living cells via pathway engineering is still in many cases the best alternative to apply biocatalysis. Isolated enzymes have, however, been successfully used in fine chemical synthesis. Some of the most important examples are:

Chirally Pure Amino Acids and Aspartame

Natural amino acids are usually produced by microbial fermentation. Novel enzymatic resolution methods have been developed for the production of L- and D-amino acids. Aspartame, the intensive non-calorie sweetener, is synthesized in non-aqueous conditions by thermolysin, a proteolytic enzyme.

Rare Sugars

Recently, enzymatic methods have been developed to manufacture practically all D- and L-forms of simple sugars. Glucose isomerase is one of the important industrial enzymes used in fructose manufacturing.

Semisynthetic Penicillins

Penicillin is produced by genetically modified strains of Penicillium. Most of the penicillin is converted by immobilized acylases to 6-aminopenicillanic acid, which serves as a backbone for many semisynthetic penicillins.

Lipase-Based Reactions

In addition to detergent applications, lipases can be used in versatile chemical reactions since they are active in organic solvents. Lipases are used in transesterification, for enantiomeric separation of alcohols, and for the separation of racemic mixtures. Lipases have also been used to form aromatic and aliphatic polymers.

Enzymatic Oligosaccharide Synthesis

The chemical synthesis of oligosaccharides is a complicated multi-step effort. Biocatalytic syntheses with isolated enzymes like glycosyltransferases and glycosidases or engineered whole cells are powerful alternatives to chemical methods. Oligosaccharides have found applications in cosmetics, medicines and as functional foods.

Overview of the Invention

The present invention provides a separatome-based host cell peptide, polypeptide, and protein expression and purification platform focusing on the proteomes of various chromatographic methods to provide a single host cell line, or set of host cell lines, that can be used for expression of a wide variety of recombinant peptides, polypeptides, and proteins, thereby eliminating the need to develop individual host cell lines for each purification process.

The “separatome” of the present separatome-based protein expression and purification platform involves the juxtaposition of the binding properties of host cell peptides, polypeptides, and proteins in common chromatographic techniques (e.g., IMAC, IEX, and/or HIC) with the characteristics and location of the corresponding encoding genes on the target host cell chromosome(s). While the examples of the separatome-based protein expression and purification platform disclosed herein focus on Escherichia coli as the host cell, and its chromatotome, the invention is not limited thereto as the separatome-based peptide, polypeptide, and protein expression and purification platform can extend to any suitable host conventionally used for recombinant expression, such as Lactococcus lactis, Bacillus species such as B. licheniformis, B. amyloliquefaciens, and B. subtilis, Corynebacterium glutamicum, Pseudomonas fluorescens, or other prokaryotes; fungi, including various yeasts such as Saccharomyces cerevisiae, Pichia (now K.) pastoris, and Pichia methanolica; insect cells; mammalian cells; plant cells, including for example, tobacco (e.g., cultivars BY-2 and NT-1), alfalfa, rice, tomato, soybean, as well as algal cells; and protozoal cells such as the non-pathogenic strain of Leishmania tarentolae, etc.

The present separatome-based peptide, polypeptide, and protein expression and purification platform is an efficient bioseparation system that intertwines host cell strain and chromatography. Since the high cost of product purification often limits the availability of therapeutic proteins of interest to immunology, vaccine development, pharmaceutical production, and diagnostic reagents, as well as the availability of enzymes for various applications, the present separatome-based peptide, polypeptide, and protein expression and purification platform provides alternative pathways towards efficient purification based on the utilization of proteome data. In particular, the separatome-based protein expression and purification platform provides for: (i) a system of chromatographic data based on identified, conserved genomic regions that span resin- and gradient-specific chromatographies, or chromatotomes, for example, a database of E. coli proteins that span the chromatography total contaminant pool (TCP)/elution contaminant pool (ECP) and bind under various conditions to a variety of chromatographic resins; and (ii) a process to minimize contaminant pools of nuisance or coeluting proteins associated with specific chromatographies, for example, gradients that substantially decrease the number of coeluting proteins encountered during bioseparation, and the specific, targeted deletion of nuisance host cell peptide-, polypeptide-, and protein-encoding genes to minimize contaminant pools associated with affinity adsorption and non-affinity adsorption chromatographies, including IMAC, cation IEX, anion IEX, HIC, and combinations thereof.

The separatome-based peptide, polypeptide, and protein expression and purification platform is constructed based upon a computer system of identified, conserved genomic regions that span resin- and gradient-specific chromatographies, or chromatotomes. The computer system includes a data visualization program/application resident on a standard computer device, such as a mainframe, desktop, or other computer. For example, the computer may have a central processor that controls the overall operation of the computer and a system bus that connects the central processor to one or more conventional components, such as a network card or modem. The computer may also include a variety of interface ports and drives for reading and writing data or files. A user of the separatome-based protein expression and purification platform can interact with the computer with a keyboard, pointing device, microphone, pen device, or other input device. The computer may be connected via a suitable network connection, such as a T1 line, a common local area network (“LAN”), via the worldwide web, or via other mechanism for connecting computer devices.

The separatome-based peptide, polypeptide, and protein expression and purification platform will utilize large amounts of data compiled on the metalloproteome and metabolome of the selected host cell, such as E. coli. The data visualization program/application, such as CIRCOS®, a software package for visualizing data and information in a circular layout (available from Canada's Michael Smith Genome Sciences Center), enables the user to visualize the large amounts of data and information for exploring relationships between objects or positions. FIGS. 1 and 2 illustrate examples of how the data visualization program/application could illustrate the E. coli chromosome mapped with the chromatotome of multiple chromatographic techniques, thus showing where the different chromatotomes lie within the greater genome. Each line in FIG. 1 represents a single contaminating protein, and the graph at its base shows the total concentration of the protein as a percent of the TCP or ECP. If each TCP is subdivided into its respective ECPs, then further corollaries can be drawn between proteins and genomic location. Further, segments of the ring represent the E. coli genome or the proteome associated with a particular isolation technique. With respect to E. coli, inner rings can represent additional information like essentiality, successful deletion, metabolic function, etc. For a given chromatographic technique, inner ring data can represent conditions that trigger adsorption or elution, concentration in the extract, and whether the protein is differentially expressed during stress.

In addition, the separatome-based peptide, polypeptide, and protein expression and purification platform may utilize and/or incorporate data about the target genome and proteome sequences, such as from ECOGENE® (Institute for Advanced Biosciences, Keio University and Integrated Genomics, Chicago, Ill.), a database and website that reports the structural and functional annotation of Escherichia coli K-12 described in Zhou et al. (2013) Nucleic Acids Research 41, Database issue, (D1): D613-D624. doi: 10.1093/nar/gks1235. The data visualization program/application of the separatome-based protein expression and purification platform provides the user a feasible means of utilizing the data by melding it into a productive format, and in particular, the data visualization program/application provides the ability to visually summarize large collections of data covering peptides, polypeptides, and proteins encountered in the chromatotome and their essentiality.

The mapping and plotting of the IMAC, IEX and HIC data by the separatome-based peptide, polypeptide, and protein expression and purification platform allows for the identification of large contiguous regions of contaminants from several chromatography techniques that may be targeted for modification if necessary.

Since the overall structure of a target recombinant peptide, polypeptide, or protein and the column resin are usually fixed constraints, a reduction in contaminant species has the ability to improve chromatographic recovery and purification via elimination of undesirable binding events. Overall reduction of contaminant species, including undesired host cell peptides, polypeptides, and proteins, can be achieved by removal, modification, or inhibition of the expression of the genomic regions coding for the contaminants.

General Methods

Practice of the present invention employs, unless otherwise indicated, conventional techniques of molecular biology, recombinant DNA technology, microbiology, chemistry, etc., which are well known in the art and within the capabilities of those of ordinary skill in the art. Such techniques include the following non-limiting examples: preparation of cellular, plasmid, and bacteriophage DNA; manipulation of purified DNA using nucleases, ligases, polymerases, and DNA-modifying enzymes; introduction of DNA into living cells; cloning vectors for various organisms; PCR; gene deletion, modification, replacement, or inhibition; production of recombinant peptides, polypeptides, and proteins in host cells; chromatographic methods; etc.

Such methods are well known in the art and are described, for example, in Green and Sambrook (2012) Molecular Cloning: A Laboratory Manual, Fourth Edition, Cold Spring Harbor Laboratory Press; Ausubel et al. (2003 and periodic supplements) Current Protocols in Molecular Biology, John Wiley & Sons, New York, N.Y.; Amberg et al. (2005) Methods in Yeast Genetics: A Cold Spring Harbor Laboratory Course Manual, 2005 Edition, Cold Spring Harbor Laboratory Press; Roe et al. (1996) DNA Isolation and Sequencing: Essential Techniques, John Wiley & Sons; J. M. Polak and James O'D. McGee (1990) In Situ Hybridization: Principles and Practice, Oxford University Press; M. J. Gait (Editor) (1984) Oligonucleotide Synthesis: A Practical Approach, IRL Press; D. M. J. Lilley and J. E. Dahlberg (1992) Methods in Enzymology: DNA Structure Part A: Synthesis and Physical Analysis of DNA, Academic Press; Lab Ref: A Handbook of Recipes, Reagents, and Other Reference Tools for Use at the Bench, Edited by Jane Roskams and Linda Rodgers (2002) Cold Spring Harbor Laboratory Press; and Burgess and Deutscher (2009) Guide to Protein Purification, Second Edition (Methods in Enzymology, Vol. 463), Academic Press. Note also U.S. Pat. Nos. 8,178,339; 8,119,365; 8,043,842; 8,039,243; 7,303,906; 6,989,265; U.S. 20120219994A1; and EP1483367B1. The entire contents of each of these texts and patent documents are herein incorporated by reference.

Methods for Deleting, Modifying, and Inhibiting the Expression of Genes

Baba et al. (2006) Mol. Syst. Biol. 2:2006.0008 discloses methods for making precisely defined single gene deletions in E. coli.

Datsenko et al. (2000) Proc. Natl. Acad. Sci. USA. 97(12):6640-5 discloses methods for inactivating chromosomal genes in E. coli using PCR products.

Stringer et al. (2012) PLoS ONE 7(9): e44841. doi:10.1371/journal.pone.0044841 discloses a rapid, efficient, PCR-based recombineering method that can be used to introduce scar-free point mutations, deletions, epitope tags, and promoters into the genomes of multiple species of enteric bacteria.

Methods for RNA silencing and antisense oligonucleotide inhibition of gene expression are well known in the art. Note, for example, the reviews in Nature (2009) 457, No. 7228, pp. 395-433 and Molecular Cancer Therapeutics (2002) 1:347-355, respectively.

Frequently Used Expression Systems for Foreign Genes

Yin et al. (2007) Journal of Biotechnology 127(3):335-347 reviews the most frequently used expression systems for foreign genes.

Listed below are a number of representative papers describing protein production in frequently used host cell systems.

Expression in E. coli

Baneyx (1999) Curr. Opin. Biotechnol. 10(5): 411-21.

Expression in Bacillus Species

Bacillus species produce and secrete a large number of useful proteins and metabolites (Zukowski (1992) “Production of commercially valuable products,” In: Doi and McGloughlin (eds.) Biology of Bacilli: Applications to Industry, Butterworth-Heinemann, Stoneham, Mass., pp. 311-337). The most common Bacillus species used in industry are B. licheniformis, B. amyloliquefaciens, and B. subtilis. Because of their GRAS (generally recognized as safe) status, strains of these Bacillus species are natural candidates for the production of proteins utilized in the food and pharmaceutical industries. Important production enzymes include α-amylases, neutral proteases, and alkaline (or serine) proteases.

Published U.S. application 2012/0183995 discloses methods and compositions for the improved expression and/or secretion of proteins of interest in Bacillus.

Expression in Corynebacterium glutamicum

Date et al. (2006) Lett Appl Microbiol. 42(1): 66-70.

Expression in Pseudomonas fluorescens

Retallack (2011) Protein Expression and Purification 81:157-65.

Expression in Eukaryotes

Yeasts and Other Fungi

Cregg et al. (2000) Mol. Biotechnol. 16(1): 23-52.

Malys et al. (2011) Methods Enzymol. 500:197-212.

Expression in Plant Cells

Hellwig et al. (2004) Nature Biotechnology 22(11):1415-1422 comprehensively reviews the field of plant cell cultures for the production of recombinant proteins. The authors note that suspension cell cultures have been prepared from several different plant species, including Arabidopsis thaliana, Taxus cuspidata, Catharanthus roseus, and domestic crops such as tobacco, alfalfa, rice, tomato, and soybean. They also point out that some researchers focus on plants with a high protein content, for example soybean and lupin, assuming that these might more readily facilitate higher protein expression levels.

Hempel et al. ((2011) “Algae as Protein Factories: Expression of a Human Antibody and the Respective Antigen in the Diatom Phaeodactylum tricornutum”, PLoS ONE 6(12): e28424. doi:10.1371/journal.pone.0028424) provides an example of the use of an algal cell, Phaeodactylum tricornutum, to produce a monoclonal human IgG antibody against the Hepatitis B surface protein and the respective antigen. This reference further discusses the potential of algae as an efficient, low-cost, and CO₂-neutral expression system, exhibiting fast growth rates without the risk of human pathogenic contaminations. The authors note current investigations on the use of the green alga Chlamydomonas reinhardtii as an expression system.

Expression in Insect and Mammalian Cells

Baculovirus-Infected Insect Cells

Baculovirus-infected insect cells such as the Sf9, Sf21, and Hi-5 strains can be used to express large quantities of glycosylated proteins that cannot be expressed using E. coli or yeasts. Genes are not expressed continuously because infected host cells eventually lyse and die during each infection cycle (Yin et al. (2007) Journal of Biotechnology 127:335-347).

Non-Lytic Insect Cell Expression

Non-lytic insect cell expression is an alternative to the lytic baculovirus expression system. In non-lytic expression, vectors are transiently or stably transfected into the chromosomal DNA of insect cells for subsequent protein expression (Dyring (2011) Bioprocessing Journal 10 (2011) 28-35; Olczak and Olczak (2006) Analytical Biochemistry 359 (2006) 45-53). This is followed by selection and screening of recombinant clones (McCarroll and King (1997) Current Opinion in Biotechnology 8:590-594). The non-lytic system has been used to give higher protein yield and quicker expression of recombinant proteins compared to baculovirus-infected cell expression (Olczak, supra). Cell lines used for this system include Sf9, Sf21 from Spodoptera frugiperda cells, Hi-5 from Trichoplusia ni cells, and Schneider 2 cells and Schneider 3 cells from Drosophila melanogaster cells (Dyring, supra; McCarroll and King, supra). With this system, cells do not lyse, and several cultivation modes can be used (Dyring, supra). Additionally, protein production runs are reproducible (Dyring, supra; Olczak and Olczak, supra). This system yields a homogeneous product.

Kost et al. (2005) Nat. Biotechnol. 23(5):567-75.

Rosser et al. (2005) Protein Expr. Purif. 40(2):237-43.

Lackner et al. (2008) Anal. Biochem. 380(1):146-8.

Expression in Animal Cells

Currently, about 60-70% of all recombinant protein pharmaceuticals are produced in mammalian cells, and it is estimated that several hundred clinical candidate proteins are under development. Many of these are expressed in immortalized Chinese hamster ovary (CHO) cells, while other cell lines, such as those derived from mouse myeloma (NS0), baby hamster kidney (BHK), human embryo kidney (HEK-293), and human retinal cells have gained regulatory approval for recombinant protein production (F. M. Wurm (2004) Nature Biotechnology 22(11):1393-1398). This reference discusses various aspects of recombinant protein expression in mammalian cells, including the design of expression vectors for DNA delivery, and cell culture.

Expression in Protozoal Cells

The eukaryotic protozoan Leishmania tarentolae (non-pathogenic strain) expression system, available from Jena Bioscience as LEXSY (Leishmania expression system), allows stable and lasting production of proteins at high yield, in chemically defined media. Produced proteins exhibit fully eukaryotic post-translational modifications, including glycosylation and disulfide bond formation.

Examples of specific host cells useful in the present invention include the following. These listings should not be construed to be limiting as other host cells known in the art are also useful in the present methods, and are encompassed by the present invention.

TABLE 1
References Disclosing E. coli Strain Genomic Sequences

Table Entry Number | E. coli Strain Number | Reference (Source of Genomic Sequence) | Genome Size (Mb)
1 | E. coli K-12 | Blattner F R, et al. Science 1997 Sep. 5; 277(5331): 1453-62. | 4.639
2 | E. coli MG1655 | Blattner F R, et al. Science 1997 Sep. 5; 277(5331): 1453-62. | 4.639
3 | E. coli BL21 (DE3) | Jeong H, et al. J Mol Biol 2009 Dec. 11; 394(4): 644-52 | 4.56
4 | E. coli DH10B | Durfee et al. J Bacteriol. 2008 April; 190(7): 2597-606 | 4.69

TABLE 2
References Disclosing B. subtilis Strain Genomic Sequences

Table Entry Number | B. subtilis Strain Number | Reference (Source of Genomic Sequence)
1 | B. subtilis 168 | Barbe et al. Microbiology. 2009 June; 155(Pt 6): 1758-75
2 | B. subtilis BSn5 | Deng et al. J Bacteriol. 2011 April; 193(8): 2070-1

In Tables 3 and 4, the cited genomic sequence reference is for the permanent curation of the National Center for Biotechnology Information (NCBI). Since eukaryotic cells have multiple chromosomes, their genomic sequences are often spread across multiple publications that each address a specific chromosome or section of a chromosome. NCBI curates this information and compiles it under their permanent Entrez Gene database.

TABLE 3
References Disclosing S. cerevisiae Strain Genomic Sequences

Table Entry Number | S. cerevisiae Strain Number | Reference (Source of Genomic Sequence)
1 | S. cerevisiae S288c | NCBI-Entrez Gene, Nucleic Acids Res. 2011 January; 39(Database issue): D52-7
2 | S. cerevisiae AWRI796 | NCBI-Entrez Gene, Nucleic Acids Res. 2011 January; 39(Database issue): D52-7

TABLE 4
References Disclosing K. pastoris Strain Genomic Sequences (K. pastoris was formerly known as Pichia pastoris)

Table Entry Number | K. pastoris Strain Number | Reference (Source of Genomic Sequence)
1 | K. pastoris CBS7435 | Kuberl A, et al. J Biotechnol 2011
2 | K. pastoris GS115 | NCBI-Entrez Gene, Nucleic Acids Res. 2011 January; 39(Database issue): D52-7

TABLE 5
References Disclosing CHO Cell Strain Genomic Sequences

Table Entry Number | CHO Cell Strain Number | Reference
1 | CHO-K1 | Puck T T, et al. J. Exp. Med. 108: 945-956, 1958. Xu, et al. Nature Biotechnology 29, 735-741 (2011)

Human embryonic kidney (HEK) cells have 4.5 kb of adenovirus 5 DNA in addition to the human genome. This information can also be found in the NCBI Entrez Gene Database.

TABLE 6
References Disclosing HEK Cell Strain Genomic Sequences

Table Entry Number | HEK Cell Strain Number | Reference (Source of Genomic Sequence)
1 | HEK 293 | NCBI-Entrez Gene, Nucleic Acids Res. 2011 January; 39(Database issue): D52-7

The following examples are provided to illustrate various aspects of the present invention, and should not be construed as limiting the invention only to these particularly disclosed embodiments. The materials and methods employed in the examples below are for illustrative purposes, and are not intended to limit the practice of the present invention thereto. Any materials and methods similar or equivalent to those described herein can be used in the practice or testing of the present invention.

EXAMPLE 1 Identification of Host Cell Proteins Associated With a Specific Product, Histidine-Tagged Green Fluorescent Protein, as a Comparative Example

This comparative example demonstrates the identification of proteins of the 120 mM imidazole fraction (Ni(II) IMAC) and subsequent gene deletions. It demonstrates how to eliminate host cell contaminants for a specific target recombinant product, Green Fluorescent Protein (GFPuv), extended by a histidine-rich affinity tag (His₆-GFP). His₆-GFP elutes similarly to other histidine-tagged proteins found in the literature. While this example discloses three gene deletions that, in principle, would enhance the purity of the desired product, the knockouts of cyoA, adhP, and yfbG and their subsequent lack of expression do not favorably impact column capacity. These three proteins are insignificant in the metalloproteome of E. coli. Thus, no changes to the separatome are disclosed that lead to an overall increase in separation efficiency. The text of this example is an annotated version of the inventors' work described in Liu et al. (2009) J. Chromatog. A 1216:2433-2438.

Strains, Plasmids, and Growth Conditions

Escherichia coli BL21 DE3 expressing GFPuv tagged with HHHHHH (His₆) (SEQ ID NO: 1) were constructed using basic molecular biology techniques. PCR primers F (5′-GCCAAGCTTGTGGCATCATCATCCGCATATGAGTAAAGGAGAAGAACTTTTC-3′) (SEQ ID NO:2) and R (5′-TTGGAATTCATTATTTGTAG AGCT-3′) (SEQ ID NO:3), containing Hind III and EcoRI sites (underlined correspondingly), were used to amplify and extend GFPuv. These enzymes were used to digest the PCR fragment and the parent plasmid. T4 DNA ligase was then used to construct a new vector that was built from the PCR-extended gene and the major part of the pGFPuv plasmid. Transformants were selected in LB agar containing 50 μg/ml ampicillin. E. coli cells were grown in Luria-Bertani (LB) overnight and inoculated in a 2-liter flask containing 500 ml M9 supplemented with 10 g/L glucose such that the initial A₆₆₀ was 0.1. To express His₆-GFPuv, 4% inoculations of overnight cultures were made in 500 mL LB and induced with 1 mM of IPTG after 1-2 hours. Fermentations were carried out at 37° C. and the agitation speed of the shaker was set at 200 rpm. Cell pellets were collected by centrifugation at 5000 g and frozen at −80° C. before cell lysis.

Sample Preparation and Chromatography

Cell pellets were suspended in 20 ml 1 X native purification buffer (50 mM NaH₂PO₄, pH 8.0; 500 mM NaCl) combined with 100 μl Triton X-100, 80 μl 100 mM MgCl₂, 20 μl phenylmethylsulphonyl fluoride (PMSF) and 100 μl 100 mg/mL lysozyme. The mixture was sonicated on ice at 4 W for 30 min using a Vibra cell ultrasonifier (Fisher Scientific, Pittsburgh, Pa., USA), and centrifuged at 5000 rpm for 20 min. The supernatants were collected and passed through a 0.45 μm filter before column loading.

For experiments identifying natural contaminants or to follow the adsorption and elution of His₆-GFP, the cleared lysate was applied to 4 ml ProBond nickel-chelating resin in an open column followed by equilibration using 1 X native purification buffer (5 X native purification buffer, as supplied with the resin, is comprised of 250 mM NaH₂PO₄, pH 8.0; 2.5 M NaCl). Step elutions were carried out with native purification buffer with the following imidazole concentrations: 60 mM, 80 mM, 100 mM, 120 mM, and 200 mM. This was followed by a 500 mM EDTA elution. The elution volumes for each step were 24 ml, or 6 column volumes (CVs), and applied at an approximate flow rate of 0.5 ml/min. Fractions were collected and measured for protein concentration with a BCA Protein Assay Kit (Pierce, Rockford, Ill., USA) and/or assayed for GFPuv in triplicate with a Tecan Infinite M200 96-well plate reader with excitation/emission spectra set to 395/509 nm.
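The step-elution schedule described above can be tabulated with a short calculation. The following Python sketch is illustrative only: it uses the resin volume, step volume, flow rate, and imidazole concentrations stated above, while the function name `elution_schedule` and the dictionary keys are hypothetical. Because the flow rate is stated as approximate, the computed times are estimates.

```python
# Values taken from the protocol text above.
RESIN_VOLUME_ML = 4.0
STEP_VOLUME_ML = 24.0
FLOW_RATE_ML_PER_MIN = 0.5  # stated as approximate in the text
IMIDAZOLE_STEPS_MM = [60, 80, 100, 120, 200]


def elution_schedule(steps_mm, step_volume_ml, resin_volume_ml, flow_ml_per_min):
    """Return one record per imidazole step with column volumes and timing."""
    column_volumes = step_volume_ml / resin_volume_ml
    minutes_per_step = step_volume_ml / flow_ml_per_min
    schedule = []
    start_min = 0.0
    for concentration in steps_mm:
        schedule.append({
            "imidazole_mM": concentration,
            "column_volumes": column_volumes,
            "start_min": start_min,
            "end_min": start_min + minutes_per_step,
        })
        start_min += minutes_per_step
    return schedule


SCHEDULE = elution_schedule(
    IMIDAZOLE_STEPS_MM, STEP_VOLUME_ML, RESIN_VOLUME_ML, FLOW_RATE_ML_PER_MIN
)

for step in SCHEDULE:
    print(step)
```

The sketch covers only the five imidazole steps; the subsequent EDTA elution is omitted because its volume is not stated in the text.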

SDS-PAGE and Mass Spectrometry.

Sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE) was performed for 6 hours at 100 V. Gels were stained with Coomassie Blue. The Genomics and Proteomics Core Laboratories at the University of Pittsburgh performed the protein identification. To account for the experimental accuracy of the measurement, three spots were excised from each band and each digested with trypsin. Peptides were separated by liquid chromatography (LC), then identified by tandem mass spectrometry (MS/MS) fragmented by collision-induced dissociation. MASCOT v2.1 (Matrix Science, Boston, Mass., USA) was used to match LC/MS data with E. coli proteins. For positive identification, spectral data from each of the three spots matched.

Functional Prediction of Identified Proteins in 120 mM Elution Fraction.

Functional classification of all identified proteins was based on the Profiling of Escherichia coli chromosome (PEC) database (Hashimoto et al. (2005) Molecular Microbiology 55: 137-149).

Construction of Knockout Mutants.

All the knockout mutants of this Example were generated with the same deletion system according to the manual accompanying the Quick and Easy E. coli Gene Deletion Kit (Gene Bridges, Heidelberg, Germany). This kit uses plasmid pRedET to facilitate homologous recombination events. During the progression of the work, a triple mutant of BL21 (ΔcyoAΔyfbGΔadhP) was constructed through a series of operations consisting of recombination, selection with kanamycin, confirmation, and removal of the selection marker using flippase recognition sites (FRT-flanked kanamycin gene).

Southern Blot Analysis

DNA probes used for Southern hybridization were prepared from PCR-amplified fragments. Probes were labeled according to the manual of the Amersham Gene Images Random Prime Labeling Kit (GE Healthcare). Genomic DNA was isolated from knockout mutants using standard protocols. DNA samples were digested with Bam HI, separated by electrophoresis on 1% agarose gels, transferred to Amersham Hybond-N+ membranes (GE Healthcare), and then baked at 80° C. for 2 hours. The probes were hybridized to these blots and detected according to the protocol of the Gene Images ECL Detection Kit (GE Healthcare).

SDS-PAGE Evaluation of CyoA, YfbG and AdhP Knockout in Mutant Strains.

Cell preparations of BL21, mutants, and chromatography fractions were evaluated by SDS-PAGE. Approximately 15 μg sample/well were loaded into a 12% acrylamide gel.

Identification of Knockout Candidates and Confirmation of their Deletion

A total extract of E. coli protein was loaded to the ProBond nickel-chelating column using 1 X native purification buffer (5 X native purification buffer, as supplied with the resin, is comprised of 250 mM NaH₂PO₄, pH 8.0; 2.5 M NaCl). Step elutions were carried out with native purification buffer with the following imidazole concentrations: 60 mM, 80 mM, 100 mM, 120 mM, and 200 mM. FIG. 3 shows the protein concentrations in each fraction normalized to the total protein used for column loading. The bar graph indicates order of magnitude changes in the total protein encountered with each imidazole challenge. Note that the 120 mM imidazole elution fraction contained the least amount of protein. Coincidentally, this fraction, which is low in host cell protein, is also the fraction where His₆-GFPuv elutes.

SDS-PAGE and LC-MS/MS were used to identify the cellular proteins present in the concentrated sample of pooled 120 mM imidazole elution fractions. A total of 18 proteins were identified (Table 7), with cyoA, yfbG, and adhP selected for deletion due to lack of essentiality. Southern blot analysis and gel electrophoresis indicated lack of expression of the three gene products cyoA, yfbG, and adhP. FIG. 4 shows this confirmation due to lack of spots associated with positive hybridization and bands of the molecular weights of these products, respectively.

TABLE 7
Proteins eluted at 120 mM from a Ni(II)-NTA column

dnaK | yfbG | adhP | cyoA | rplB | slyD
nagD | ahpC | rpsG | rplO | rpsE | rplM
Fur | Hypothetical protein ECs2542 | rplJ | rpsL | Hns | rplL

These results demonstrate that it is possible to apply a limited set of data and to produce a knockout strain capable of enhancing the purity of a recombinant peptide, polypeptide, or protein. It is used as a comparative example to illustrate the lack of a rigorous methodology to identify specific changes to the host cell that lead to an altered separatome capable of broadly improving separation efficiency, and column capacity in particular, regardless of desired recombinant product.

EXAMPLE 2 Construction of the Ion Exchange Separatome of E. coli and Its Use to Design and Build Novel Host Strains for a Common Chromatography Resin

This example describes the process by which a separatome is constructed for a chromatography resin and subsequently used to guide modifications to E. coli to increase chromatographic efficiency. It begins by describing how data are acquired by fractionating an extract derived from fed batch growth over a DEAE ion exchange bed, and continues by constructing the separatome, a data structure that includes information on the genes responsible for identified proteins coupled to a quantitative scoring to rank order molecular biology efforts that lead to a reduced separatome. Finally, construction of example strains is described, concluding with information regarding high priority strain modifications necessary for significant gains in separation efficiency through their deletion, modification, or inhibition.
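The idea of a separatome as a data structure coupled to a ranking can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the scoring used by the platform, which is not given in this passage: the class `SeparatomeEntry`, its fields, and the function `rank_candidates` are hypothetical, the ranking simply orders non-essential genes by their share of the total contaminant pool (TCP), and the gene names and numbers are placeholders rather than measured data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SeparatomeEntry:
    """One contaminating host cell protein and its encoding gene (hypothetical schema)."""
    gene: str              # gene name
    position_bp: int       # location on the host chromosome, in base pairs
    essential: bool        # whether the gene is essential to the host
    percent_of_tcp: float  # protein amount as a percent of the TCP


def rank_candidates(entries):
    """Order non-essential genes by decreasing share of the TCP.

    Assumption: essential genes are excluded here for simplicity; in practice
    they might instead be candidates for modification or inhibition.
    """
    candidates = [entry for entry in entries if not entry.essential]
    return sorted(candidates, key=lambda entry: entry.percent_of_tcp, reverse=True)


# Placeholder entries for illustration only; not data from the source.
EXAMPLE_ENTRIES = [
    SeparatomeEntry("geneA", 100_000, False, 2.0),
    SeparatomeEntry("geneB", 250_000, True, 9.0),
    SeparatomeEntry("geneC", 900_000, False, 5.5),
]

RANKED = rank_candidates(EXAMPLE_ENTRIES)
for entry in RANKED:
    print(entry.gene, entry.percent_of_tcp)
```

Keeping the chromosomal position alongside the binding data is what would allow neighboring contaminant genes to be grouped into contiguous regions for a single modification.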

Cloning Strains and Vectors

E. coli strain MG1655 (K-12 derivative) was selected as the base strain for cell line modification because of its widespread use and lack of commercial license. Its genotype is F⁻ λ⁻ rph-1, meaning that it lacks an F pilus, the phage lambda, and has a 15 codon frame-shift as a result of the rph 1 bp deletion (Yale University. E. coli Genetic Stock Center Database. 2013). This frame-shift interrupts the pyrE gene and reduces pyrimidine levels (Jensen et al. (1993) Journal of Bacteriology 175(11):3401-7).

Plasmid pKD46 was used as part of the λ-red recombination system. This plasmid is ampicillin resistant and replication is temperature sensitive. For plasmid maintenance, growth is at 30° C. and the plasmid can be removed by growth at 37° C. without antibiotic pressure. The plasmid encodes the lambda Red genes exo, bet, and gam, and includes an arabinose-inducible promoter for expression (Datsenko et al. (2000) PNAS 97(12):6640-5). The plasmid was provided in conjunction with MG1655 from the Yale E. coli Genetic Stock Center (New Haven, Conn.).

Expression Strains and Vectors

E. coli strain BL21 (DE3) was used for initial cell culture and cell lysate preparation. Its genotype is F⁻ ompT hsdSB(rB⁻, mB⁻) gal dcm (DE3). The strain and genotype were provided by Novagen (EMD-Millipore/Merck). The cell line was transformed with a recombinant pGEX plasmid provided by Dr. Joshua Sakon (Department of Chemistry, University of Arkansas). This plasmid, pCHC305, contains the genetic information for the recombinant fusion protein, glutathione-S-transferase–parathyroid hormone–collagen binding domain (GST-PTH-CBD, 383 amino acids).

Storage Strains and Vectors

For storage of DNA constructs, E. coli strain DH5α was selected. Its genotype is F-, Δ(argF-lac)169 φ80dlacZ58(M15) ΔphoA glnV44(AS)8λ-deoR481 rfbC gyrA96(NalR)l recA1 endA1 thiE1 hsdR17. DH5α is a non-mutagenized derivative of DH1, which transforms more efficiently due to a deoR mutation. The recA mutation eliminates homologous recombination and minimizes undesired modification to stored plasmids.

pUC19 was used as a DNA storage vector. It is a high copy number plasmid that carries ampicillin resistance. This plasmid was provided in conjunction with DH5α from the Yale E. coli Genetic Stock Center (New Haven, Conn.).

Liquid Growth Media

M9 medium was used where a minimal defined medium was required. M9 Medium was made in 3 separate stock solutions: glucose solution (500 g/L), trace elements (2.8 g of FeSO4-7H2O, 2 g of MnCl2-4 H2O, 2.8 g of CoCl2-7H2O, 1.5 g of CaCl2-2 H2O, 0.2 g CuCl2-2 H2O, 0.3 g of ZnSO4-7 H2O), and 5 x M9 (75 g of K2HPO4, 37.5 g of KH2PO4, 10 g of citric acid, 12.5 g of (NH4)2SO4, 10 g of MgSO4-7 H2O). Each of these components must be autoclaved individually to minimize salt precipitation. To prepare 1 L of M9, 14.5 ml of the glucose solution is mixed with 1 ml trace element solution, 200 ml of 5 x M9, and enough water to bring the final volume up to 1 L (approximately 784.5 ml).

Where rich medium was required, Luria-Bertani (LB) Medium was used. LB powder was purchased from Difco and was prepared per the manufacturer's instructions: 20 g LB powder per 1 L of MILLI-Q® water (ultrapure water in agreement with the quantitative specifications of Type I water as described in ISO 3696, ASTM D1193, and of EP and USP Purified Water, as well as the CLSI®-CLRW).
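The volume arithmetic for 1 L of liquid M9 can be checked with a short Python sketch. The stock volumes and the 500 g/L glucose stock concentration are taken from the recipe above; the helper names `water_to_add_ml` and `final_conc_g_per_l` are hypothetical and serve only to illustrate the calculation.

```python
# Values taken from the liquid M9 recipe above.
GLUCOSE_STOCK_G_PER_L = 500.0
GLUCOSE_STOCK_ML = 14.5
TRACE_ELEMENTS_ML = 1.0
M9_5X_ML = 200.0
FINAL_VOLUME_ML = 1000.0


def water_to_add_ml(final_volume_ml, *component_volumes_ml):
    """Water needed to bring the mixed components up to the final volume."""
    return final_volume_ml - sum(component_volumes_ml)


def final_conc_g_per_l(stock_g_per_l, stock_volume_ml, final_volume_ml):
    """Concentration of a stock after dilution into the final volume."""
    return stock_g_per_l * stock_volume_ml / final_volume_ml


WATER_ML = water_to_add_ml(
    FINAL_VOLUME_ML, GLUCOSE_STOCK_ML, TRACE_ELEMENTS_ML, M9_5X_ML
)
GLUCOSE_G_PER_L = final_conc_g_per_l(
    GLUCOSE_STOCK_G_PER_L, GLUCOSE_STOCK_ML, FINAL_VOLUME_ML
)
M9_DILUTION_FACTOR = FINAL_VOLUME_ML / M9_5X_ML

print(f"water to add: {WATER_ML} ml")
print(f"final glucose: {GLUCOSE_G_PER_L} g/L")
print(f"5 x M9 dilution factor: {M9_DILUTION_FACTOR}")
```

The computed water volume matches the approximately 784.5 ml stated in the recipe, and the 200 ml of 5 x M9 in 1 L gives the expected five-fold dilution.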

Solid Growth Media

Solid M9 medium was prepared as previously described for liquid M9 with the addition of agar to the water and concentrated M9 solution prior to autoclaving. To prepare 500 ml of M9 agar, 7.5 g agar, 100 ml of 5 x M9 solution, and 300 ml of water are mixed and autoclaved. Added to this is 7.25 ml sterile glucose solution (700 g/L), 500 μl trace elements, and enough sterile water to bring the final volume up to 500 ml. The other solid medium used was LB agar, which was prepared the same as the LB liquid medium described earlier plus the addition of 7.5 g agar per liter.

Fed-Batch Cultivation

Fed-batch cultivation was used to prepare the cell lysate for use indownstream protein purification and identification of natively expressedproteins. The cell line used was BL21 pCHC305. To begin fermentation, asingle colony was isolated from a LB ampicillin agar plate andtransferred to a 5 ml culture tube containing liquid LB plus 150 μg/mlampicillin. This culture tube was allowed to incubate overnight at 37°C. After overnight growth, the 5 ml culture tube is supplemented with100 ml of M9 with ampicillin and allowed to grow at 37° C. for six toeight hours. This 100 ml culture is then centrifuged at 4750 rpm for 25minutes (Beckman Coulter Allegra) and re-suspended in 50 ml of fresh M9medium with 150 μg/ml ampicillin. This culture is used as the inoculantfor the fed-batch growth. The 3-liter Applikon bioreactor (Foster City,Calif.) contained 1 liter of M9 plus 150 μg/ml ampicillin and 1 mlsilicone anti-foam.

The Applikon was equipped with BioXpert Advisory software from Applikon, an Applisense pH probe, and a dissolved oxygen probe. To maintain proper dissolved oxygen, the reactor was supplemented with a compressed oxygen cylinder with a controllable flow rate. To ensure effective gas dispersal, the culture was initially stirred at 750 rpm, and the stirring rate was later increased to 1000 rpm based on cell density. Adjustments in oxygen delivery were made as necessary during the process to ensure that the dissolved oxygen concentration did not drop below 35%. The pH was maintained at approximately 6.8 (with a range of 6.75 to 7) during the cultivation by adding 7 M NH₄OH as needed. Temperature was maintained at 37° C. using a heating jacket and cooling loop. Optical densities were monitored using a BUGEYE™ optical density probe (BUGLAB®, Foster City, Calif.) and a DU800 Beckman Coulter spectrophotometer (Brea, Calif.). BUGEYE™ is a non-invasive optical biomass measuring device for measuring biomass through the side wall of a shake flask using a handheld sensor, described in U.S. Pat. No. 8,405,033. A linear correlation for the BUGEYE™ response to the actual optical density (OD) as measured by the spectrophotometer was determined for each individual experiment.
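The per-experiment linear correlation between probe response and spectrophotometer OD can be illustrated with an ordinary least-squares fit. This is only a sketch: the source does not state how the correlation was computed, and the paired readings below are hypothetical placeholders, not measured data.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical paired readings for one experiment (not data from the source).
probe_readings = [0.5, 1.0, 2.0, 4.0]
spectrophotometer_od = [1.2, 2.2, 4.2, 8.2]

slope, intercept = fit_line(probe_readings, spectrophotometer_od)


def probe_to_od(reading):
    """Convert a probe reading to an estimated OD using the fitted line."""
    return slope * reading + intercept


print(f"OD = {slope:.3f} * reading + {intercept:.3f}")
```

Because the fit is repeated for each experiment, the slope and intercept are best treated as run-specific calibration constants rather than instrument constants.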

The fed-batch fermentation process has two phases, a batch phase and a feeding phase. In the batch phase, the culture uses only the carbon sources provided in the media at the start of the cultivation and no nutrients are fed to the reactor. This phase lasts approximately 7-8 hours, depending on the lag phase of the culture and how rapidly the culture grows on the initial carbon substrate. The shift from batch phase to feeding phase can be determined by two indicators, a rise in pH and a sharp decline in oxygen concentration, which indicate that the initial carbon substrate has been depleted. In the fed-batch experiments, these two events occur simultaneously and are displayed by the Applikon software. The feeding profile used for fermentation experiments is based on that of a collaborator (McKinzie Fruchtl) and was originally proposed by Korz et al. (1995) Journal of Biotechnology 39(1):59-65 and Lee et al. (1996) Trends in Biotechnology 14(3):98-105. A feeding profile was programmed into the Applikon software that mimics the exponential feed based on substrate concentrations.

An exponential fed-batch fermentation method commonly used to pre-determine the amount of glucose that should be fed into the reactor to achieve a certain growth rate was proposed by Korz et al. and Lee et al., supra:

${M_{s}(t)} = {{{F(t)}{S_{F}(t)}} = {{( {\frac{\mu}{Y_{X/S}} + m} ){X(t)}{V(t)}} = {( {\frac{\mu}{Y_{X/S}} + m} ){X( t_{F} )}{V( t_{F} )}\exp^{\mu {({t - t_{F}})}}}}}$

where M_(s) is the mass flow rate (g/h) of the substrate, F is the feeding rate (l/h), S_(F) is the concentration of the substrate in the feed (g/l), μ is the specific growth rate (1/h), Y_(X/S) represents the biomass on substrate yield coefficient (g/g), m is the maintenance coefficient (g/g h), and X and V represent the biomass concentration (g/l) and cultivation volume (l), respectively. The yield coefficient for E. coli on glucose is generally taken to be 0.5 g/g (Korz et al., supra; Shiloach et al. (2005) Biotechnology Advances 23(5):345-57). The maintenance coefficient is often 0.025 g/g h (Korz et al., supra). This equation has been widely adapted for fed-batch fermentation processes, as exponential feeding allows cells to grow at a constant rate (Kim et al. (2004) 26(3):147-50).
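The exponential feed equation can be evaluated numerically as sketched below. The defaults for Y_(X/S) (0.5 g/g) and m (0.025 g/g h) are the values quoted in the text; the growth rate, biomass concentration, volume, feed concentration, and feed start time in the example are hypothetical, and the feed concentration S_(F) is assumed constant. The function names are illustrative only.

```python
import math


def substrate_mass_flow(t, t_f, x_f, v_f, mu, y_xs=0.5, m=0.025):
    """M_s(t) in g/h: (mu / Y_X/S + m) * X(t_F) * V(t_F) * exp(mu * (t - t_F))."""
    return (mu / y_xs + m) * x_f * v_f * math.exp(mu * (t - t_f))


def feed_rate(t, t_f, x_f, v_f, mu, s_f, y_xs=0.5, m=0.025):
    """F(t) in l/h, obtained from M_s(t) = F(t) * S_F with S_F assumed constant."""
    return substrate_mass_flow(t, t_f, x_f, v_f, mu, y_xs, m) / s_f


# Hypothetical operating point (illustrative values, not from the source).
MU = 0.1      # specific growth rate, 1/h
T_F = 8.0     # time at which feeding starts, h
X_F = 5.0     # biomass concentration at t_F, g/l
V_F = 1.0     # culture volume at t_F, l
S_F = 500.0   # glucose concentration in the feed, g/l

ms_start = substrate_mass_flow(T_F, T_F, X_F, V_F, MU)
ms_later = substrate_mass_flow(T_F + math.log(2) / MU, T_F, X_F, V_F, MU)
f_start = feed_rate(T_F, T_F, X_F, V_F, MU, S_F)
print(f"M_s at feed start: {ms_start:.3f} g/h, F at feed start: {f_start * 1000:.2f} ml/h")
```

One consequence of the exponential form is that the required substrate flow doubles every ln(2)/μ hours, which is why the feed is usually programmed as a profile rather than set manually.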

During fed-batch fermentation, the cells were left un-induced to prevent the addition of the recombinant protein to the native protein pool. The fermentation was allowed to grow for a total of 24 hours from inoculation to harvest. At the end of the fermentation process, cells were harvested from the reactor by pumping the reactor contents into centrifuge bottles. The reactor contents were then centrifuged at 12,000 x g for 30 minutes at 5° C. (Beckman Coulter Avanti, JLA-10,500 fixed-angle rotor) to separate the cell pellet from the media. The pellet was separated into four 50 ml conical bottom tubes for storage at −20° C.

Lysate Preparation

One of the 50 ml pellets (58.9 g) was re-suspended in 150 ml of 25 mM Tris buffer, pH 7. To enable cell lysis, 2 mg/ml lysozyme was added to the mixture. In addition, 1 mM phenylmethylsulphonyl fluoride (PMSF), 20 μg/ml aprotinin, and 1 mM ethylenediamine-tetraacetic acid (EDTA) were added to minimize protein degradation. The mixture was then incubated on ice with stirring for 30 minutes to lyse the cells. The mixture was then centrifuged at 50,000 x g (Beckman Coulter Avanti, JA-25.50 fixed-angle rotor) for 30 minutes at 5° C. to separate the proteins from the cell debris.

The proteins in the supernatant were carefully pipetted out of the centrifuge tubes, to minimize contaminants from the insoluble fraction, and were clarified by syringe filtration through 0.45 μm cellulose acetate. Lastly, the total protein concentration of the cell lysate was determined using a Bio-Rad DC Protein Assay, which is a detergent-compatible colorimetric assay that is read by spectrophotometer at 750 nm (Beckman Coulter DU 800 HP). Bovine serum albumin standards were used to determine the baseline correlation between protein concentration and absorbance at 750 nm.

Fast Protein Liquid Chromatography

Fast protein liquid chromatography (FPLC) was used to separate the natively expressed proteins into groups based on the salt concentration at which they elute, which correlates to their surface charge. The chromatography was performed using an Amersham ÄKTA FPLC. The system consists of dual syringe pumps (P-920), a gradient mixer, a monitor (UPC-900) for UV (280 nm), pH, and conductivity, a fraction collector (Frac-900), and UNICORN® V3.21 data collection and archive software.

Resin

For the initial database development, diethylaminoethyl cellulose (DEAE) was selected as the ion exchange (IEX) resin due to its prevalence of use in industrial manufacturing. Specifically, the column used was a 1 ml HiTrap DEAE FF from GE Healthcare. DEAE is a weak anion exchanger, meaning that it is a positively charged matrix with a narrow working pH of 2-9 (GE Healthcare. Instructions 71-5017-51 AG HiTrap ion exchange columns. 1-24.).

Buffer Composition

25 mM Tris buffer, pH 7, was selected for all of the FPLC purification steps. The loading buffer contained 10 mM NaCl to minimize non-specific binding (Buffer A). The elution buffer contained 1 M NaCl, which is sufficient to desorb bound proteins (Buffer B).

Column Loading Conditions

Prior to loading the column, the system was washed with buffer A until equilibrium was achieved (roughly 10 ml). At this point, all system monitors were base-lined. The column was loaded at 10% breakthrough as per industry standard. The amount of total lysate to be applied to the column to achieve this breakthrough was determined as follows. According to GE Healthcare, the dynamic binding capacity (DBC) of HiTrap DEAE FF is 110 mg HSA (human serum albumin)/ml sorbent (resin). This number gives the amount of protein that can be bound per milliliter of resin. The next step was to determine what percentage of the native proteins bound to the DEAE resin at pH 7. To do this, 5 ml of lysate was loaded on the column and washed with 10 ml of buffer A. The flow-through was collected in a single fraction. The column was then washed with buffer B and the resulting flow-through was collected. Both fractions were then analyzed for their total protein concentration using the previously mentioned Bio-Rad assay. The amount of lysate (ml) to load onto the column was determined by the following equation:

${{lysate}\mspace{14mu} ({ml})} = \frac{D\; B\; C*( {1 + \%_{BT}} )*V_{C}}{\%_{bound}*C_{1}}$

where DBC is the dynamic binding capacity of the resin (mg/ml), %_(BT) is the desired percent breakthrough, V_(c) is the volume of the column (ml), %_(bound) is the percent of the total lysate that binds to the resin, and C₁ is the protein concentration of the lysate (mg/ml).
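The loading equation can be worked through as follows. The DBC (110 mg/ml), breakthrough (10%), and column volume (1 ml) come from the text; the bound fraction and lysate concentration are hypothetical placeholders, since the measured values are not given in this section. Percentages are entered as fractions (10% as 0.10).

```python
def lysate_volume_ml(dbc_mg_per_ml, breakthrough, column_volume_ml,
                     fraction_bound, lysate_conc_mg_per_ml):
    """Lysate volume (ml) = DBC * (1 + %BT) * V_c / (%bound * C_1)."""
    protein_capacity_mg = dbc_mg_per_ml * (1.0 + breakthrough) * column_volume_ml
    bound_protein_per_ml_lysate = fraction_bound * lysate_conc_mg_per_ml
    return protein_capacity_mg / bound_protein_per_ml_lysate


load_ml = lysate_volume_ml(
    dbc_mg_per_ml=110.0,         # HiTrap DEAE FF, from the text
    breakthrough=0.10,           # 10% breakthrough, from the text
    column_volume_ml=1.0,        # 1 ml column, from the text
    fraction_bound=0.5,          # hypothetical: half of lysate protein binds
    lysate_conc_mg_per_ml=20.0,  # hypothetical lysate concentration
)
print(f"Load {load_ml:.1f} ml of lysate")
```

With these placeholder inputs the column would be loaded with about 12.1 ml of lysate; the real value depends on the measured bound fraction and protein concentration.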

The column was loaded at 1 ml/min and then washed with 10 column volumes (CV) of buffer A to remove any unbound proteins. The unbound fraction was collected for later analysis.

Column Elution Conditions

To identify where the bulk of the bound proteins eluted, the proteins were desorbed through roughly 100 mM salt steps from 10 mM to 1 M. This process allows for the identification of the priority salt fractions that need to be spaced out into smaller steps for later analysis.

TABLE 8
10% Elution Steps

Step #   % B    NaCl (mM)   Step Length (CV)
wash     0%     10          10
1        10%    109         5
2        20%    208         5
3        30%    307         5
4        40%    406         5
5        50%    505         5
6        60%    604         5
7        70%    703         5
8        80%    802         5
9        90%    901         5
10       100%   1000        5
clean    100%   1000        5

The flow rate was maintained at 1 ml/min and the pressure limit was set to 0.5 MPa for the duration of the experiment. During elution, all fractions were collected and immediately stored at 2° C. to reduce protein degradation. After all of the proteins had been desorbed in the 1000 mM step, the fraction collector was stopped and the column was cleaned with buffer B to ensure all proteins had been desorbed and washed out of the column. The column was then washed with sufficient buffer A to re-equilibrate the column.

For finer focusing on the primary elution windows, smaller 5% steps are used (Table 9). In this instance, the focus was on the 10 mM to 500 mM window.

TABLE 9
5% Elution Steps

Step #   % B    NaCl (mM)   Step Length (CV)
wash     0%     10          20
1        5%     59.5        15
2        10%    109         15
3        15%    158.5       15
4        20%    208         15
5        25%    257.5       15
6        30%    307         15
7        35%    356.5       15
8        40%    406         15
9        45%    455.5       15
10       50%    505         15
wash     100%   1000        20
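The NaCl concentrations in Tables 8 and 9 follow from linear mixing of buffer A (10 mM NaCl) and buffer B (1 M NaCl). The sketch below reproduces that arithmetic; the function name `step_nacl_mm` is illustrative and not part of the source.

```python
def step_nacl_mm(percent_b, buffer_a_mm=10.0, buffer_b_mm=1000.0):
    """NaCl concentration (mM) when percent_b % of the mobile phase is buffer B."""
    fraction_b = percent_b / 100.0
    return buffer_a_mm + fraction_b * (buffer_b_mm - buffer_a_mm)


# 10% steps from 0% to 100% B (Table 8) and 5% steps from 0% to 50% B (Table 9).
ten_percent_steps = [(p, step_nacl_mm(p)) for p in range(0, 101, 10)]
five_percent_steps = [(p, step_nacl_mm(p)) for p in range(0, 51, 5)]

for percent_b, nacl in five_percent_steps:
    print(f"{percent_b:3d}% B -> {nacl:.1f} mM NaCl")
```

Each 10% increment in buffer B therefore adds 99 mM NaCl, which is why the text describes the coarse gradient as "roughly 100 mM" steps.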

Analytical Assays

Sample Processing

Prior to the samples undergoing further analysis, they were concentrated using a GE Lifesciences VIVASPIN™ 20 (5,000 MWCO). VIVASPIN™ is a centrifugal membrane ultrafiltration sample concentrator employing a semipermeable membrane with a molecular weight cutoff selected by the user for non-denaturing concentration of biological samples by membrane ultrafiltration. Centrifugation is applied to force solvent through the membrane, leaving a more concentrated sample in the upper chamber of the device. This reduced the 20 ml fractions to 2 ml total volume. This was split into two 1 ml samples; one was sent for LC-MS/MS, and the other was kept for SDS-PAGE.

Protein Gels—SDS-PAGE

Sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) was used to observe the approximate number of proteins in each FPLC salt fraction and their molecular weight. Prior to SDS-PAGE, the samples were desalted by buffer exchange. To do this, the previously mentioned 1 ml sample of the desired fraction was concentrated in a GE Lifesciences VIVASPIN™ 2 (5,000 MWCO) and re-suspended in 25 mM Tris buffer, pH 7. The concentration and re-suspension process was repeated two more times to ensure all salt had been removed. After the last concentration step, the sample was left in its concentrated form to be loaded onto the SDS-PAGE.

A Bio-Rad PROTEAN® II system (large format vertical electrophoresis cell used for common electrophoretic techniques such as SDS-PAGE, native electrophoresis, and agarose gel electrophoresis) was used for the electrophoresis with SDS buffer. The SDS buffer is made as a 10 x stock, where the 1 x running buffer is 25 mM Tris, 192 mM glycine, and 0.1% SDS at a pH of 8.6. For visualization of the chromatography samples, a 12.5% gel was used. The samples are mixed 5:1 with a 5 x loading dye.

Electrophoresis was carried out at 100 V until the sample was through the stacking gel, then increased to 140 V. Average run time was around 1 hour. Gels were stained using a Coomassie Blue stain (40% methanol, 10% acetic acid and 0.5% Coomassie blue) for 3 hours and then de-stained with a 10% acetic acid and 40% methanol solution. Gel images were captured by scanning on a computer flatbed scanner.

Liquid Chromatography Mass Spectroscopy (LC-MS/MS)

Samples of each FPLC salt fraction were sent to Bioproximity (Chantilly, Va.) for protein identification via liquid chromatography mass spectroscopy (LC-MS/MS). The protocol for the LC-MS/MS was provided by Bioproximity as follows.

Protein Denaturation and Digestion

Prior to digestion, proteins were prepared using the filter-aided sample preparation (FASP) method (Wiśniewski et al. (2009) 6(5):359-62). Next, the sample was mixed with 8 M urea, 10 mM dithiothreitol (DTT), 50 mM Tris-HCl at pH 7.6 and sonicated briefly. Samples were then concentrated in a Millipore AMICON® Ultra (30,000 MWCO) device (a cellulose membrane centrifugal filter unit for concentrating biological samples) and centrifuged at 13,000 x g for 30 min. The remaining sample was buffer exchanged with 6 M urea, 100 mM Tris-HCl at pH 7.6, then alkylated with 55 mM iodoacetamide. Concentrations were measured using a QUBIT® fluorometer (Invitrogen), an instrument for quantitating DNA, RNA, and proteins using fluorescent dyes that emit signals only when bound to specific target molecules. The urea concentration was reduced to 2 M, trypsin was added at a 1:40 enzyme to substrate ratio, and the sample was incubated overnight at 37° C. on a THERMOMIXER® (Eppendorf), a temperature-controlled device used for mixing liquids in closed micro- and larger test tubes, and micro test plates. The AMICON® was then centrifuged and the filtrate collected.

Peptide Desalting

Digested peptides were desalted using C18 stop-and-go extraction (STAGE) tips (Rappsilber et al. (2003) Analytical Chemistry. American Chemical Society 75(3):663-70). For each sample, the C18 STAGE tip was briefly activated with methanol, and then conditioned with 60% acetonitrile and 0.5% acetic acid, followed by 2% acetonitrile and 0.5% acetic acid. Samples were loaded onto the tips and desalted with 0.5% acetic acid. Peptides were eluted with a 60% acetonitrile, 0.5% acetic acid solution and dried in a vacuum centrifuge (Thermo Savant).

Liquid Chromatography-Tandem Mass Spectrometry

Peptides were analyzed by LC-MS/MS. LC was performed on an Easy-nanoLC II HPLC system (Thermo). Mobile phase A was 94.5% MILLI-Q® water, 5% acetonitrile, 0.5% acetic acid. Mobile phase B was 80% acetonitrile, 19.5% MILLI-Q® water, 0.5% acetic acid. The 120 min LC gradient ran from 2% B to 50% B over 90 min, with the remaining time used for sample loading and column regeneration. Samples were loaded to a 2 cm×100 μm I.D. trap column positioned on an actuated valve (Rheodyne). The column was 13 cm×100 μm I.D. fused silica with a pulled tip emitter. Both trap and analytical columns were packed with 3.5 μm C18 resin (Magic C18-AQ, Michrom). The LC was interfaced to a dual pressure linear ion trap mass spectrometer (LTQ Velos, Thermo Fisher) via nano-electrospray ionization. An electrospray voltage of 2.4 kV was applied to a pre-column tee. The mass spectrometer was programmed to acquire, by data-dependent acquisition, tandem mass spectra from the top 15 ions in the full scan from 400-1400 m/z. Dynamic exclusion was set to 30 seconds.

Data Processing and Library Searching

Mass spectrometer RAW data files were converted to MGF (Mascot generic format) using msconvert (Kessner et al. (2008) Bioinformatics 24(21):2534-6). Detailed search parameters are printed in the search output XML (extensible markup language) files. All searches required strict tryptic cleavage, up to three missed cleavages, fixed modification of cysteine alkylation, variable modification of methionine oxidation, and expectation value scores of 0.01 or lower. Searches used the sequence libraries: UniProt Escherichia coli (strain B/BL21-DE3, The UniProt Consortium (2012) Nucleic Acids Research 40(Database issue):D71-5), the common Repository of Adventitious Proteins (cRAP) (The Global Proteome Machine. Common Repository of Adventitious Proteins. Jan. 1, 2012), and the given sequence for plasmid product GST-PTH-CBD. MGF files were searched using X!Tandem (Craig et al. (2004) Bioinformatics 20(9):1466-7) using both the native and k-score (MacLean et al. (2006) Bioinformatics 22(22):2830-2) scoring algorithms and by the Open Mass Spectrometry Search Algorithm (OMSSA) (Geer et al. Journal of Proteome Research 3(5):958-64). All searches were performed on Amazon Web Services-based cluster compute instances using the Proteome Cluster interface. XML output files were parsed and non-redundant protein sets determined using MassSieve. Proteins were required to have two or more unique peptides across the analyzed samples with E-value scores of 0.01 or less, 0.001 for X!Hunter, and protein E-value scores of 0.0001 or less.

Protein Quantitation

Proteins were quantified by the spectral counting method (Liu et al. (2004) Analytical Chemistry 76(14):4193-201). This results in a hit count, which approximates the protein concentration in the sample.
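In its simplest form, spectral counting tallies the number of identified MS/MS spectra assigned to each protein. The sketch below illustrates that tally only; it is not the vendor's pipeline, it applies no normalization, and the spectrum identifiers and assignments are hypothetical.

```python
from collections import Counter


def spectral_counts(peptide_spectrum_matches):
    """Count identified spectra per protein.

    Each match is a (spectrum_id, protein_id) pair; the hit count for a protein
    is the number of spectra assigned to it.
    """
    return Counter(protein for _spectrum, protein in peptide_spectrum_matches)


# Hypothetical matches for one salt fraction (not data from the source).
matches = [
    ("scan_0001", "metH"),
    ("scan_0002", "metH"),
    ("scan_0003", "entF"),
    ("scan_0004", "metH"),
]
hits = spectral_counts(matches)
print(hits.most_common())
```

The per-fraction hit counts produced this way play the role of the protein amounts h_(i,j) used in the scoring described later in this section.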

Database Construction

Compilation of Data

The received LC-MS/MS data was imported into Microsoft Access 2010 for data management. The ECOGENE®'s EcoTools Database Table Download (Rudd KE. Database Table Download ECOGENE® 3.0. Department of Biochemistry and Molecular Biology R-629, University of Miami Miller School of Medicine; 2012) was used to supplement the received LC-MS/MS data with additional genomic and proteomic data. The data added were: the protein length (in amino acids), direction of replication (clockwise or counterclockwise), left end position of the gene (in base pairs), right end position of the gene (in base pairs), molecular weight of the protein, common gene name, synonym gene name, protein name, protein function, description, GenBank GI ID (Benson et al. (2013) GenBank. Nucleic Acids Research 41(Database issue):D36-42), and UniProtKB/Swiss-Prot ID (The UniProt Consortium (2012) Nucleic Acids Research 40(Database issue):D71-5). The ECOGENE® Cross Reference Mapping and Download tool was used to obtain the Bnum (Blattner number) (Blattner et al. (1997) Science 277(5331):1453-62). Microsoft Access was used to build relationships between the various datasets that allowed for searches across the compiled database.

Gene essentiality data were retrieved from Gerdes et al. ((2003) J. Bacteriol. 185(19):5673-84), which compiles gene essentiality from their own research as well as the Profiling of E. coli Chromosome (PEC) database (Hashimoto et al. (2005) Molecular Microbiology 55(1):137-49; Kato and Hashimoto (2007) Molecular Systems Biology 3(132):132; and Kang Y et al. (2004) J. Bacteriol. 186(15):4921-30).

Data Manipulation

In order to determine the priority of genes to be deleted, each gene was given a score for each elution window. This score or importance criterion was defined by:

$\text{importance}_{i} = \sum_{j} \left[ b_{1} \left( \frac{y_{c_{j}}}{y_{\max}} \right) \left( \frac{h_{i,j}}{h_{i,\text{total}}} \right) \left( \frac{h_{i,j}}{h_{j,\text{total}}} \right) \left( \frac{MW_{i}}{MW_{\text{ref}}} \right)^{\alpha} \right]_{i}$

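with the following definitions: b₁=scaling parameter; y_(cj) and y_(max)=concentration of mobile phase eluent in fraction (j) and maximum value, respectively; h_(i,j) and h_(i,total)=the amount of protein (i) in fraction (j) and total bound protein (i), respectively; h_(j,total)=total amount of protein in fraction (j); MW_(i) and MW_(ref)=molecular weight of protein (i) and of a reference protein within the host cell proteome, respectively; and α=steric factor.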

In the function given, the score can range from 1 (high) to 0 (low). The summation ranges over the desired elution windows (j) and can be adjusted to cover all of the windows, or target a select few. The first ratio accounts for adsorption strength, with y_(cj) being the concentration of the elution solvent (in the case of ion exchange, this is NaCl) and y_(max) being the maximum solvent concentration. The second ratio accounts for adsorption specificity, with h_(i,j) being the protein concentration in the window, over the total protein concentration in all windows (h_(i,total)). For proteins that elute in only one window, this value will be 1, whereas proteins that elute in multiple windows will have a lower ratio. The third ratio describes the relative amount a protein has in a given fraction, and the fourth ratio accounts for the possibility of steric hindrance.
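The importance criterion can be sketched in code as a sum, over elution windows, of the product of the four ratios. This is an illustrative reading of the equation, not the authors' implementation: the data layout (a list of fractions, each pairing an eluent concentration with per-protein amounts), the function name, the optional `windows` argument, and all numeric values are assumptions. With b₁ = 1 and α = 0 the steric term is 1 for every protein.

```python
def importance_scores(fractions, y_max, mol_weights, mw_ref,
                      b1=1.0, alpha=0.0, windows=None):
    """Score each protein by summing the four-ratio product over elution windows.

    fractions   -- list of (y_cj, {protein: h_ij}) pairs, one per elution window
    y_max       -- maximum eluent concentration
    mol_weights -- {protein: molecular weight}
    windows     -- optional set of fraction indices to sum over (default: all)
    """
    # h_i,total: amount of each protein summed over every fraction.
    h_i_total = {}
    for _y, amounts in fractions:
        for protein, h in amounts.items():
            h_i_total[protein] = h_i_total.get(protein, 0.0) + h

    scores = {protein: 0.0 for protein in h_i_total}
    for j, (y_cj, amounts) in enumerate(fractions):
        if windows is not None and j not in windows:
            continue
        h_j_total = sum(amounts.values())  # total protein in fraction j
        for protein, h in amounts.items():
            if h == 0:
                continue  # protein absent from this fraction contributes nothing
            strength = y_cj / y_max                  # adsorption strength
            specificity = h / h_i_total[protein]     # adsorption specificity
            abundance = h / h_j_total                # share of fraction j
            steric = (mol_weights[protein] / mw_ref) ** alpha
            scores[protein] += b1 * strength * specificity * abundance * steric
    return scores


# Hypothetical two-window example (values are not from the source).
fractions = [
    (500.0, {"A": 2.0, "B": 8.0}),
    (1000.0, {"A": 8.0}),
]
mol_weights = {"A": 50.0, "B": 100.0}
scores = importance_scores(fractions, y_max=1000.0,
                           mol_weights=mol_weights, mw_ref=50.0)
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked, scores)
```

In this toy case protein A scores higher than B because most of it elutes alone at the highest salt concentration, which matches the intent described above: strongly bound, sharply eluting, abundant proteins are the priority targets.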

The second and third equations define how much capacity is recovered when the protein is removed, and the overall capacity recovery as one modifies, deletes, or inhibits n genes, respectively:

$\text{recovery potential}_{i} = \frac{h_{i,\text{total}}}{h_{\text{total},ms}}$ and $\text{capacity recovery} = 100\% \times \sum_{i=1}^{n} \text{recovery potential}_{i}$
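The two recovery equations can be sketched as below. The source does not define h_(total,ms) in this section; here it is assumed to be the total amount of all bound host cell protein measured by mass spectrometry. The gene names are taken from Table 12, but the amounts are hypothetical.

```python
def recovery_potential(h_i_total, h_total_ms):
    """Fraction of bound protein attributable to protein i."""
    return h_i_total / h_total_ms


def capacity_recovery_percent(genes, h_totals, h_total_ms):
    """Percent capacity recovered when the listed genes are removed."""
    return 100.0 * sum(recovery_potential(h_totals[g], h_total_ms) for g in genes)


# Hypothetical total bound amounts per gene product (not data from the source).
h_totals = {"metH": 30.0, "entF": 20.0, "tgt": 10.0}
H_TOTAL_MS = 1000.0  # assumed: total bound host cell protein measured by MS

recovered = capacity_recovery_percent(["metH", "entF"], h_totals, H_TOTAL_MS)
print(f"Capacity recovery: {recovered:.1f}%")
```

Because the recovery is a plain sum, each additional deletion adds its own recovery potential, so ranking genes by that value shows how quickly the total rises as deletions accumulate.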

Homologous Recombination

Removal of Genomic thyA

Flexible Recombineering Using Integration of thyA (FRUIT) as described by Stringer et al. ((2012) PloS one. 7(9):e44841), a modification of the Datsenko λ-Red homologous recombination system (Datsenko et al. (2000) PNAS 97(12):6640-5), was used to delete the targeted genes from the genome of E. coli strain MG1655. The method begins by creating an MG1655 ΔthyA strain by swapping the gene for an oligonucleotide designed to have 60 bp of homology at the beginning and the end of the thyA gene. FIG. 5 shows the process by which this deletion is performed.

This oligonucleotide was ordered as two linear single stranded DNA fragments from Integrated DNA Technologies (Coralville, Iowa). The fragments were hydrated in Qiagen EB buffer (Tris, pH 8, 1.4 M NaCl) and mixed in a 1:1 ratio. The mixture was then placed in an MJ Research PTC-200 DNA Engine thermocycler that was programmed to heat to 98° C. and then drop the temperature by 2° C. every 30 seconds until it reached 25° C.

To Delete thyA

MG1655:pKD46 was cultured overnight at 30° C. in LB plus ampicillin (100 μg/ml). The following morning, the overnight culture was sub-cultured 1:100 into 5 ml of fresh LB-ampicillin with 0.2% L-arabinose and allowed to grow for approximately three hours, until the culture reached an OD₆₀₀ (HP DU800) of 0.6 to 0.8. All proper controls were also taken to validate the recombination event. To prepare the cells for electroporation, the 5 ml induced culture was split into four 1 ml aliquots and moved to 1.5 ml microfuge tubes. The final 1 ml was refrigerated for later analysis or for further sub-culturing. The microfuge tubes were centrifuged for 60 seconds at 14,000 rpm in a cooled (placed in refrigerator, 2° C.) compact bench-top microfuge centrifuge (Eppendorf, MINISPIN®). The supernatant was discarded by gently pouring off the liquid and then the pellet was placed on ice. The pellet was then re-suspended in 1 ml of chilled distilled/deionized water (ddH₂O) and then centrifuged again. This process was repeated once more. After the supernatant was poured off the final time, roughly 100 μl of liquid was left in the tube. Next, the cells were re-suspended in the remaining fluid and kept on ice. To this, the prepared linear fragment, in this case the thyA deletion template, was added at various concentrations, usually ranging from 200-1000 nmol. This mixture was then pipetted into chilled sterile electroporation cuvettes (Bio-Rad, 0.1 cm gap). The sample was then electroporated using a Bio-Rad MICROPULSER™ set to Ec1 (E. coli, 0.1 cm cuvette, 1.8 kV, one pulse). The Bio-Rad MICROPULSER™ is an apparatus used for the electroporation of bacteria, yeast, and other microorganisms where a high voltage electrical pulse is applied to a sample suspended in a small volume of high resistance media, consisting of a pulse generator module, a shocking chamber, and a cuvette with incorporated electrodes.
Next, 1 ml of LB containing ampicillin (50 μg/ml), thymine (100 μg/ml) and trimethoprim (20 μg/ml) (LB-amp-thy-tri) was gently added directly to the cuvette before incubating the sample at 30° C. with shaking for 3 hours. Since the strain now lacks thyA, it is necessary to supplement the media with thymine. Trimethoprim acts as a secondary selector because if the strain still contains an active thyA gene, the trimethoprim is toxic. After that time, the cultures were streaked out onto LB-ampicillin-thymine-trimethoprim agar plates and allowed to incubate at 30° C. overnight. In addition, 250 μl of each culture was sub-cultured into 5 ml of LB-ampicillin-thymine-trimethoprim and incubated overnight at 30° C. with shaking.

Gene Deletion

The gene deletion protocol is a two-step process. The first step uses thyA as a selection marker that disrupts the targeted gene. The second step removes thyA from the genome again, following a similar protocol as described previously. For the first step, strain LTS00 is grown overnight in LB-amp-thy-tri and is sub-cultured 1:100 the following morning into 5 ml LB-amp-thy-tri plus 0.2% L-arabinose. These cells are allowed to grow for approximately 6 hours (growth is significantly diminished when lacking thyA) until the OD₆₀₀ reaches 0.6 to 0.8. The cells are then prepared for electroporation as described previously. Prior to electroporation, 2 μl of the PCR product containing the thyA gene with homology to the gene to be deleted is added to the sample. Electroporation follows the same protocol as earlier. After electroporation, 1 ml of LB with ampicillin (50 μg/ml) is added, and the cells are allowed to incubate for 3 hours at 30° C. with shaking. After that time, the cultures are streaked out onto LB-ampicillin (150 μg/ml) agar plates and allowed to incubate at 30° C. overnight. In addition, 250 μl of each culture is sub-cultured into 5 ml of LB-ampicillin (150 μg/ml) and incubated overnight at 30° C. with shaking.

FIG. 6 shows the process by which the selection marker is used to cause a gene deletion. Step 1 creates the intermediary thyA+ strain, where the target gene has been deleted but the selection marker remains. At this point, the cell is able to survive on thymine-depleted media. Step 2 removes the thyA marker so that it can be used again for future gene deletions. The protocol is the same as that for the removal of thyA, but the 120 bp oligo has homology to the new gene target and removes thyA and its promoter.

Successful deletion of the gene was confirmed via PCR amplification of the deleted region and agarose gel electrophoresis. The amplified regions were also sent for genomic sequencing to further confirm that the homologous recombination event successfully occurred.

TABLE 10
Deletion Fragments

Name     Gene Target   Sequence
thyAdt   thyA          GCAAAATTTCGGGAAGGCGTCTCGAAGAATTTAA CGGAGGGTAAAAAAACCGACGCACACGTGTTGCTGTGGGCTGCGACGATATGCCCAGACCATCATGAT CACACCCGCGACAATCAT (SEQ ID NO: 4)
metHdt   metH          TTTGTTGAATTTTTATTAAATCTGGGTTGAGCGTGTCGGGAGCAAGTGCTGGGGTATGACGCGGACTG ATTCACAAATCTGTCACTTTTCCTTACAAC (SEQ ID NO: 5)
entFdt   entF          GGCGTACTCTGACACCGACGAATTTTACCCAGTTGCAGGAGGCACACGCGCAACGCTAAACAGGTAAA TTAATATTATTTATAAACCCATAATTAC (SEQ ID NO: 6)
tgtdt    tgt           CGCTGGTTTAAAACGTTGGACTGTTTTTCTGACGTAGTGGAGAAAAACCACCTTTGAACGTTGATTAA TATTAATAATGAGGGAAATTTAATGAGCT (SEQ ID NO: 7)
rnrdt    rnr           GTGGAGTGACGAAAATCTTCATCAGAGATGACAACGGAGGAACCGAGAAGAAAAAAGTGGCAGAGTGA TCAATACCCTCTTTAAAAGAAGAGGGTTA (SEQ ID NO: 8)
ycaOdt   ycaO          TAAAACCCGTATTATTGCGCGCTTTCCGTACGACTAAAGTGATTTTCGCAGCATTCTGGGCAAAATAA AATCAAATAGCCTACGCATGTAGGCTTA (SEQ ID NO: 9)

These results demonstrate how the separatome can be defined for a chromatographic technique, ion exchange in particular, and can be used to design and construct novel host cells that have certain genes deleted, modified, or inhibited. For example, Table 11 describes ten separate E. coli MG1655 derivatives that have one or more gene deletions associated with high affinity host cell proteins. These strains in their current form can be used to express a target recombinant protein and will have enhanced separation efficiency, column capacity in particular, as these proteins are contained in several fractions of high salt concentration.

TABLE 11
E. coli Deletion Strains

Name     Genotype
MG1655   Wild Type: F-, λ⁻, rph-1
LTS00    ΔthyA
LTS01+   ΔmetH
LTS01    ΔthyAΔmetH
LTS02+   ΔmetHΔentF
LTS02    ΔthyAΔmetHΔentF
LTS03+   ΔmetHΔentFΔtgt
LTS03    ΔthyAΔmetHΔentFΔtgt
LTS04+   ΔmetHΔentFΔtgtΔrnr
LTS04    ΔthyAΔmetHΔentFΔtgtΔrnr
LTS05+   ΔmetHΔentFΔtgtΔrnrΔycaO

Table 12 lists high priority genes for DEAE ion exchange media. Future strains of the LTS series of Table 11 will have additional genes identified in Table 12 deleted, modified, or inhibited as the recovery capacity is pushed towards higher values.

TABLE 12
High priority genes of the DEAE separatome, loading pH 7

Gene Name
rpoC   rpoB   hldD   metH   entF   mukB   tgt    rnr    glgP   recC
ycaO   glnA   ptsI   metE   sucA   hrpA   groL   gatZ   speA   thiI
nusA   tufA   degP   clpB   rapA   metL   ycfD   nagD   ilvA   fusA
cyaA   gldA   dnaK   ygiC   gyrA   glnE   carB   ppsA   degQ   usg
ilvB   thrS   recB   entB   dusA   typA   prs    cysN   atpD   purL

The invention being thus described, it will be obvious that the same may be varied in many ways. Such variations are not to be regarded as a departure from the spirit and scope of the invention, and all such modifications as would be obvious to one skilled in the art are intended to be included within the scope of the following claims.

What is claimed is:
 1. An isolated host cell for expression of a target recombinant peptide, polypeptide, or protein, wherein the chromatographic separation efficiency of said target recombinant peptide, polypeptide, or protein expressed in said isolated host cell is improved in an amount in the range of from about 3% to about 50%, wherein the genome of said isolated host cell: i) is a reduced genome, ii) a modified genome, or iii) a genome in which expression of genes is reduced or completely inhibited; wherein genes that are deleted, modified, or the expression of which is reduced or completely inhibited in said isolated host cell code for isolated host cell proteome peptides, polypeptides, or proteins that impair the chromatographic separation efficiency of said target recombinant peptide, polypeptide, or protein expressed in said isolated host cell; wherein said genes are identified, quantified, scored, and ranked according to chromatographic importance in adversely affecting the chromatographic separation efficiency of said target recombinant peptide, polypeptide, or protein; and wherein deletion, modification, reduction of expression, or complete inhibition of expression of said genes in said isolated host cell improves the chromatographic separation efficiency of said target recombinant peptide, polypeptide, or protein expressed in said isolated host cell in an amount in the range of from about 3% to about 50% compared to the chromatographic separation efficiency of said target recombinant peptide, polypeptide, or protein expressed in said isolated host cell when said genes are not deleted, not modified, or the expression of which is not reduced or completely inhibited in said isolated host cell, respectively.
 2. The isolated host cell of claim 1, wherein said genes are identified and quantified by: 1) fractionating (a) a lysate of said isolated host cell in the case where said target recombinant peptide, polypeptide, or protein is not secreted from said isolated host cell, or (b) the culture medium in which said isolated host cell is grown in the case where said target recombinant peptide, polypeptide, or protein is secreted from said isolated host cell, on an affinity chromatography column comprising an affinity ligand bound to a solid phase, or on an adsorption-based, non-affinity chromatography column comprising an adsorption-based, non-affinity chromatography medium, wherein each of said columns is equilibrated using a mobile loading or eluting phase or an operational variable, and then washed with an elution gradient to elute isolated host cell proteome peptide-, polypeptide-, and protein-containing fractions from each of said columns; and 2) determining and quantifying the isolated host cell proteome peptides, polypeptides, and proteins present in each of said eluted fractions.
 3. The isolated host cell of claim 2, wherein determining the isolated host cell proteome peptides, polypeptides, and proteins present in each of said eluted fractions in 2) is performed by liquid chromatography-tandem mass spectroscopy, and quantifying the isolated host cell proteome peptides, polypeptides, and proteins present in each of said eluted fractions in 2) is performed by either (a) spectral counting or (b) a combination of Bradford protein assay, 2-dimensional electrophoresis, and densitometry.
 4. The isolated host cell of claim 2, wherein scoring and ranking the chromatographic importance of said isolated host cell proteome peptides, polypeptides, and proteins in adversely affecting the chromatographic separation efficiency of said target recombinant peptide, polypeptide, or protein comprise: (1) determining for each of said isolated host cell proteome peptides, polypeptides, and proteins present in each of said eluted fractions: (a) its adsorption strength; (b) its adsorption specificity; (c) its adsorption abundance; (d) its steric effect; and (e) its metabolic necessity; (2) multiplying the values of (a) through (e) in 1) for each of said isolated host cell proteome peptides, polypeptides, and proteins to produce an importance score for each of said isolated host cell proteome peptides, polypeptides, and proteins in each of said eluted fractions; and (3) summing each of said importance scores for each of said isolated host cell proteome peptides, polypeptides, and proteins in each of said eluted fractions to determine its overall importance score, and comparing each of said overall importance scores for each of said isolated host cell proteome peptides, polypeptides, and proteins to the overall importance scores of each of the other isolated host cell proteome peptides, polypeptides, and proteins eluted from said affinity chromatography column or said adsorption-based, non-affinity chromatography column, respectively, wherein a higher overall importance score identifies an isolated host cell proteome peptide, polypeptide, or protein having a greater adverse effect on the chromatographic separation efficiency of said target recombinant peptide, polypeptide, or protein than a proteome peptide, polypeptide, or protein having a lower overall importance score.
 5. The isolated host cell of claim 4, wherein: adsorption strength in (1)(a) is manifested as the ratio of the concentration of mobile phase eluent in each of said eluted fractions compared to the maximum mobile phase eluent concentration with which said column is washed; adsorption specificity in (1)(b) is manifested as the ratio of the amount of each individual isolated host cell proteome peptide, polypeptide, or protein present in an eluted fraction compared to the total amount thereof bound on said column; adsorption abundance in (1)(c) is manifested as the ratio of the amount of each individual isolated host cell proteome peptide, polypeptide, or protein present in an eluted fraction compared to the total amount of all isolated host cell proteome peptides, polypeptides, and proteins present in said eluted fraction; steric effect in (1)(d) is manifested as the ratio of the molecular weight of each of said isolated host cell proteome peptides, polypeptides, and proteins compared to the molecular weight of a reference protein within said isolated host cell proteome; and metabolic necessity in (1)(e) is assessed by bioinformatics or as disclosed in published literature.
 6. The isolated host cell of claim 4, wherein scoring and ranking of isolated host cell proteome peptides, polypeptides, and proteins in adversely affecting the chromatographic separation efficiency of said target recombinant peptide, polypeptide, or protein are performed employing the following equation: $\mathrm{importance}_{i} = \sum_{j}\left[ b_{1}\left( \frac{y_{c_{j}}}{y_{\max}} \right)\left( \frac{h_{i,j}}{h_{i,\mathrm{total}}} \right)\left( \frac{h_{i,j}}{h_{j,\mathrm{total}}} \right)\left( \frac{MW_{i}}{MW_{\mathrm{ref}}} \right)^{\alpha} \right]_{i}$ wherein $b_{1}$=scaling parameter; $y_{c_{j}}$ and $y_{\max}$=concentration of mobile phase eluent in fraction (j) and maximum value, respectively; $h_{i,j}$ and $h_{i,\mathrm{total}}$=the amount of protein (i) in fraction (j) and total bound protein (i), respectively; $h_{j,\mathrm{total}}$=amount of protein in fraction (j); $MW_{i}$=molecular weight of protein (i); $MW_{\mathrm{ref}}$=molecular weight of a reference protein within said isolated host cell proteome; α=steric factor; and i=protein, wherein: ratios of y's and h's adopt values between 0 and 1; a protein that remains bound and requires stringent conditions for elution exhibits a y ratio $\left( \frac{y_{c_{j}}}{y_{\max}} \right)$ close to, or equal to, unity; a protein that emerges as a tight peak has an h ratio $\left( \frac{h_{i,j}}{h_{i,\mathrm{total}}} \right)$ close to unity and a j ratio $\left( \frac{h_{i,j}}{h_{j,\mathrm{total}}} \right)$ close to unity if it constitutes the majority of fraction (j); a non-zero α between 0 and 1 indicates steric effects; and wherein the quantitative ranking for a protein is calculated by multiplying $b_{1} \times \left( \frac{y_{c_{j}}}{y_{\max}} \right) \times \left( \frac{h_{i,j}}{h_{i,\mathrm{total}}} \right) \times \left( \frac{h_{i,j}}{h_{j,\mathrm{total}}} \right) \times \left( \frac{MW_{i}}{MW_{\mathrm{ref}}} \right)^{\alpha}$, and summing the product for each fraction (j) where (i) is present.
 7. The isolated host cell of claim 1, which is selected from a bacterium, a fungus, a mammalian cell, an insect cell, a plant cell, or a protozoal cell.
 8. The isolated host cell of claim 7, wherein said bacterium is selected from E. coli, L. lactis, B. subtilis, B. licheniformis, B. amyloliquefaciens, P. fluorescens, or C. glutamicum; said fungus is a yeast selected from S. cerevisiae, K. pastoris, or P. methanolica; said mammalian cell is selected from a CHO cell, a HEK cell, a mouse myeloma cell, a BHK cell, or a human retinal cell; said insect cell is selected from an S. frugiperda cell, a T. ni cell, or a D. melanogaster cell; said plant cell is selected from a tobacco cell, an alfalfa cell, a rice cell, a tomato cell, a soybean cell, or an algal cell; and said protozoal cell is a L. tarentolae cell.
 9. A method of preparing a pharmaceutical or veterinary composition comprising a recombinant therapeutic peptide, polypeptide, or protein, comprising the steps of: a) expressing said recombinant therapeutic peptide, polypeptide, or protein in said isolated host cell of claim 1; b) in the case where said recombinant therapeutic peptide, polypeptide, or protein is not secreted from said isolated host cell, preparing a lysate of said isolated host cell containing said recombinant therapeutic peptide, polypeptide, or protein, producing an initial recombinant therapeutic peptide-, polypeptide-, or protein-containing mixture; or c) in the case where said recombinant therapeutic peptide, polypeptide, or protein is secreted from said isolated host cell, harvesting culture medium in which said isolated host cell is grown, containing said recombinant therapeutic peptide, polypeptide, or protein, thereby obtaining an initial recombinant therapeutic peptide-, polypeptide-, or protein-containing mixture; d) chromatographing said initial recombinant therapeutic peptide-, polypeptide-, or protein-containing mixture of step b) or step c) on an affinity chromatography column comprising an affinity ligand bound to a solid phase or on an adsorption-based, non-affinity chromatography column comprising an adsorption-based, non-affinity chromatography medium, and collecting elution fractions, thereby obtaining one or more fractions containing an enriched amount of said recombinant therapeutic peptide, polypeptide, or protein relative to other peptides, polypeptides, or proteins in said one or more fractions compared to the amount of said recombinant therapeutic peptide, polypeptide, or protein relative to other peptides, polypeptides, or proteins in said initial recombinant therapeutic peptide-, polypeptide-, or protein-containing mixture; e) further chromatographing an enriched fraction of step d) to obtain said recombinant therapeutic peptide, polypeptide, or protein in a desired degree of purity; f) recovering purified recombinant therapeutic peptide, polypeptide, or protein of step e); and g) formulating said purified recombinant therapeutic peptide, polypeptide, or protein of step f) with a pharmaceutically or veterinarily acceptable carrier, diluent, or excipient to produce a pharmaceutical or veterinary composition, respectively.
 10. The method of claim 9, wherein said genes that are deleted, modified, or the expression of which is reduced or completely inhibited in said isolated host cell are identified and quantified by: 1) fractionating (a) a lysate of said isolated host cell in the case where said recombinant therapeutic peptide, polypeptide, or protein is not secreted from said isolated host cell, or (b) the culture medium in which said isolated host cell is grown in the case where said recombinant therapeutic peptide, polypeptide, or protein is secreted from said isolated host cell, on an affinity chromatography column comprising an affinity ligand bound to a solid phase or on an adsorption-based, non-affinity chromatography column comprising an adsorption-based, non-affinity chromatography medium, wherein each of said columns is equilibrated using a mobile loading or eluting phase or an operational variable, and then washed with an elution gradient to elute isolated host cell proteome peptide-, polypeptide-, and protein-containing fractions from said affinity chromatography column or said adsorption-based, non-affinity chromatography column, respectively; and 2) determining and quantifying the isolated host cell proteome peptides, polypeptides, and proteins present in each of said eluted fractions.
 11. The method of claim 10, wherein scoring and ranking the chromatographic importance of said isolated host cell proteome peptides, polypeptides, and proteins in adversely affecting the chromatographic separation efficiency of said recombinant therapeutic peptide, polypeptide, or protein comprise: (1) determining for each of said isolated host cell proteome peptides, polypeptides, and proteins present in each of said eluted fractions: (a) its adsorption strength, manifested as the ratio of the concentration of mobile phase eluent in each of said eluted fractions compared to the maximum mobile phase eluent concentration with which said affinity chromatography column or said adsorption-based, non-affinity chromatography column, respectively, is washed; (b) its adsorption specificity, manifested as the ratio of the amount of each individual isolated host cell proteome peptide, polypeptide, or protein present in an eluted fraction compared to the total amount thereof bound on said affinity chromatography column or on said adsorption-based, non-affinity chromatography column, respectively; (c) its adsorption abundance, manifested as the ratio of the amount of each individual isolated host cell proteome peptide, polypeptide, or protein present in an eluted fraction compared to the total amount of all isolated host cell proteome peptides, polypeptides, and proteins present in said eluted fraction; (d) its steric effect, manifested as the ratio of the molecular weight of each of said isolated host cell proteome peptides, polypeptides, and proteins compared to the molecular weight of a reference protein within said isolated host cell proteome; and (e) its metabolic necessity, assessed by bioinformatics or as disclosed in published literature; (2) multiplying the values of (a) through (e) in 1) for each of said isolated host cell proteome peptides, polypeptides, and proteins to produce an importance score for each of said isolated host cell proteome peptides, polypeptides, and proteins in each of said eluted fractions; and (3) summing each of said importance scores for each of said isolated host cell proteome peptides, polypeptides, and proteins in each of said eluted fractions to determine its overall importance score, and comparing each of said overall importance scores for each of said isolated host cell proteome peptides, polypeptides, and proteins to the overall importance scores of each of the other isolated host cell proteome peptides, polypeptides, and proteins eluted from said affinity chromatography column or said adsorption-based, non-affinity chromatography column, respectively, wherein a higher overall importance score identifies an isolated host cell proteome peptide, polypeptide, or protein having a greater adverse effect on the chromatographic separation efficiency of said recombinant therapeutic peptide, polypeptide, or protein than an isolated host cell proteome peptide, polypeptide, or protein having a lower overall importance score.
 12. The method of claim 9, wherein said isolated host cell is selected from a bacterium, a fungus, a mammalian cell, an insect cell, a plant cell, or a protozoal cell.
 13. The method of claim 12, wherein said bacterium is selected from E. coli, L. lactis, B. subtilis, B. licheniformis, B. amyloliquefaciens, P. fluorescens, or C. glutamicum; said fungus is a yeast selected from S. cerevisiae, K. pastoris, or P. methanolica; said mammalian cell is selected from a CHO cell, a HEK cell, a mouse myeloma cell, a BHK cell, or a human retinal cell; said insect cell is selected from an S. frugiperda cell, a T. ni cell, or a D. melanogaster cell; said plant cell is selected from a tobacco cell, an alfalfa cell, a rice cell, a tomato cell, a soybean cell, or an algal cell; and said protozoal cell is a L. tarentolae cell.
 14. The method of claim 9, wherein said recombinant therapeutic peptide, polypeptide, or protein is an antibody, an antibody fragment, a vaccine, α₁-Antitrypsin, an enzyme, a growth factor, a blood clotting factor, a hormone, a nerve factor, an interferon, an interleukin, tumor necrosis factor, lung surfactant protein, or serum albumin.
 15. The method of claim 9, wherein said adsorption-based, non-affinity chromatography column is an ion exchange chromatography column.
 16. A method of purifying a recombinant enzyme, comprising the steps of: a) expressing said recombinant enzyme in said isolated host cell of claim 1; b) in the case where said recombinant enzyme is not secreted from said isolated host cell, preparing a lysate of said isolated host cell containing said recombinant enzyme, producing an initial recombinant enzyme-containing mixture; or c) in the case where said recombinant enzyme is secreted from said isolated host cell, harvesting culture medium in which said isolated host cell is grown, containing said recombinant enzyme, thereby obtaining an initial recombinant enzyme-containing mixture; d) chromatographing said initial recombinant enzyme-containing mixture of step b) or step c) on an affinity chromatography column comprising an affinity ligand bound to a solid phase, or on an adsorption-based, non-affinity chromatography column comprising an adsorption-based, non-affinity chromatography medium, and collecting elution fractions, thereby obtaining one or more fractions containing an enriched amount of said recombinant enzyme relative to other peptides, polypeptides, or proteins in said one or more fractions compared to the amount of said recombinant enzyme relative to other peptides, polypeptides, or proteins in said initial recombinant enzyme-containing mixture; e) optionally, further chromatographing an enriched fraction of step d) to obtain said recombinant enzyme in a desired degree of purity; and f) recovering purified recombinant enzyme.
 17. The method of claim 16, wherein said genes that are deleted, modified, or the expression of which is reduced or completely inhibited in said isolated host cell are identified and quantified by: 1) fractionating (a) a lysate of said isolated host cell in the case where said recombinant enzyme is not secreted from said isolated host cell, or (b) the culture medium in which said isolated host cell is grown in the case where said recombinant enzyme is secreted from said isolated host cell, on an affinity chromatography column comprising an affinity ligand bound to a solid phase, or on an adsorption-based, non-affinity chromatography column comprising an adsorption-based, non-affinity chromatography medium, wherein each of said columns is equilibrated using a mobile loading or eluting phase or an operational variable, and then washed with an elution gradient to elute isolated host cell proteome peptide-, polypeptide-, and protein-containing fractions from said affinity chromatography column or said adsorption-based, non-affinity chromatography column, respectively; and 2) determining and quantifying the isolated host cell proteome peptides, polypeptides, and proteins present in each of said eluted fractions.
 18. The method of claim 17, wherein scoring and ranking the chromatographic importance of said isolated host cell proteome peptides, polypeptides, and proteins in adversely affecting the chromatographic separation efficiency of said recombinant enzyme comprise: (1) determining for each of said isolated host cell proteome peptides, polypeptides, and proteins present in each of said eluted fractions: (a) its adsorption strength, manifested as the ratio of the concentration of mobile phase eluent in each of said eluted fractions compared to the maximum mobile phase eluent concentration with which said column is washed; (b) its adsorption specificity, manifested as the ratio of the amount of each individual isolated host cell proteome peptide, polypeptide, or protein present in an eluted fraction compared to the total amount thereof bound on said affinity chromatography column or on said adsorption-based, non-affinity chromatography column, respectively; (c) its adsorption abundance, manifested as the ratio of the amount of each individual isolated host cell proteome peptide, polypeptide, or protein present in an eluted fraction compared to the total amount of all isolated host cell proteome peptides, polypeptides, and proteins present in said eluted fraction; (d) its steric effect, manifested as the ratio of the molecular weight of each of said isolated host cell proteome peptides, polypeptides, and proteins compared to the molecular weight of a reference protein within said isolated host cell proteome; and (e) its metabolic necessity, assessed by bioinformatics or as disclosed in published literature; (2) multiplying the values of (a) through (e) in (1) for each of said isolated host cell peptides, polypeptides, and proteins to produce an importance score for each of said isolated host cell proteome peptides, polypeptides, and proteins in each of said eluted fractions; and (3) summing each of said importance scores for each of said isolated host cell proteome peptides, polypeptides, and proteins in each of said eluted fractions to determine its overall importance score, and comparing each of said overall importance scores for each of said isolated host cell proteome peptides, polypeptides, and proteins to the overall importance scores of each of the other isolated host cell proteome peptides, polypeptides, and proteins eluted from said affinity chromatography column or said adsorption-based, non-affinity chromatography column, respectively; wherein a higher overall importance score identifies an isolated host cell proteome peptide, polypeptide, or protein having a greater adverse effect on the chromatographic separation efficiency of said recombinant enzyme than an isolated host cell proteome peptide, polypeptide, or protein having a lower overall importance score.
 19. The method of claim 16, wherein said isolated host cell is selected from a bacterium, a fungus, a mammalian cell, an insect cell, a plant cell, or a protozoal cell.
 20. The method of claim 19, wherein said bacterium is selected from E. coli, L. lactis, B. subtilis, B. licheniformis, B. amyloliquefaciens, P. fluorescens, or C. glutamicum; said fungus is a yeast selected from S. cerevisiae, K. pastoris, or P. methanolica; said mammalian cell is selected from a CHO cell, a HEK cell, a mouse myeloma cell, a BHK cell, or a human retinal cell; said insect cell is selected from an S. frugiperda cell, a T. ni cell, or a D. melanogaster cell; said plant cell is selected from a tobacco cell, an alfalfa cell, a rice cell, a tomato cell, a soybean cell, or an algal cell; and said protozoal cell is a L. tarentolae cell.
 21. The method of claim 16, wherein said adsorption-based, non-affinity chromatography column is an ion exchange chromatography column.
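
For illustration only, the scoring and ranking recited in claims 4 to 6 (and repeated in claims 11 and 18) might be sketched in Python as follows. This is a minimal sketch, not part of the claims. The function names `importance_scores` and `rank_proteins`, the input layout, and the sample values are hypothetical. It also assumes that the total bound amount of protein (i), h_(i,total), equals the sum of that protein over all eluted fractions. The metabolic necessity factor of claim 4 does not appear in the equation of claim 6 and is therefore omitted here.

```python
from collections import defaultdict


def importance_scores(fractions, y_max, mw, mw_ref, b1=1.0, alpha=0.0):
    """Sum the per-fraction importance terms for each protein.

    fractions: list of (y_c, {protein: amount}) pairs, one per eluted fraction,
               where y_c is the eluent concentration of that fraction.
    y_max:     maximum eluent concentration used to wash the column.
    mw:        {protein: molecular weight}; mw_ref: reference molecular weight.
    b1, alpha: scaling parameter and steric factor.
    """
    # Assumption: h_(i,total) is the amount of protein i summed over fractions.
    h_i_total = defaultdict(float)
    for _, amounts in fractions:
        for protein, h in amounts.items():
            h_i_total[protein] += h

    scores = defaultdict(float)
    for y_c, amounts in fractions:
        h_j_total = sum(amounts.values())  # all protein in fraction j
        for protein, h in amounts.items():
            if h <= 0:
                continue  # protein absent from this fraction
            scores[protein] += (
                b1
                * (y_c / y_max)                   # adsorption strength
                * (h / h_i_total[protein])        # adsorption specificity
                * (h / h_j_total)                 # adsorption abundance
                * (mw[protein] / mw_ref) ** alpha  # steric effect
            )
    return dict(scores)


def rank_proteins(scores):
    """Return proteins ordered from highest to lowest overall importance."""
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical two-fraction example with made-up amounts.
example = [(0.5, {"A": 1.0, "B": 1.0}), (1.0, {"A": 3.0})]
scores = importance_scores(example, y_max=1.0, mw={"A": 50.0, "B": 25.0}, mw_ref=50.0)
print(rank_proteins(scores))
```

In this made-up example, protein A elutes mostly in the late, high-eluent fraction and dominates it, so it ranks above protein B.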